FOREWORD
It is an amazing fact that our schools and colleges know 
little of the results of their work. It is even more amazing 
that they seldom attempt seriously to find out what changes 
schooling brings about in students. Ask any school what its 
objectives are and you will be told that it seeks to develop 
character, ability to think clearly, social responsibility, good 
health habits, readiness for earning a living, knowledge of 
certain facts and mastery of certain skills. Ask whether the
school succeeds in doing these things, and the answer is, “We
know only in part.” Half of the boys and girls who begin the 
work of the secondary school drop out before completing it. 
Schools usually do not know why these students leave or 
what becomes of them immediately afterward. Few schools 
know even what their graduates are doing, what problems 
they are facing, or how well prepared they are to solve
them.

How can this lack of knowledge and concern be explained?
There are doubtless many causes, but one of the
most obvious is the universal emphasis upon the accumulation
of credits for promotion, graduation, and admission to
college. To secure a credit or unit the student must “pass”
a course. To pass a course he must remember certain facts
and show proficiency in certain skills. Therefore, remembering
knowledge and practicing techniques for examinations
become the purposes of education for pupils and teachers
alike. What goes on the school record becomes the real
objective of the student, no matter what the school says its
purposes are. If the pupil secures the required credits, he is
graduated. The job is done. Concentration on these worthy
but limited goals seems to make teachers and students forget
the larger, long-range purposes of education.

One of the major reasons for over-emphasis upon these
limited objectives is that results in these fields are more 
easily measured than in other less tangible areas. There are 
many instruments of evaluation applicable to the conven- 
tional subjects of the curriculum. Much of the work of such 
organizations as the Educational Records Bureau, the Co- 
operative Test Service, and the College Entrance Examina- 
tion Board is of great value to schools and colleges. But 
most tests available when this Study began were measures 
chiefly of accretions of knowledge and proficiency in the 
use of skills. Because such tests are at hand the teacher uses 
them. Because instruments of appraisal in other areas have 
not been available, the teacher tends to neglect other objec- 
tives and to strive only for results that can be ascertained 
with relative ease and objectivity. 

It follows, then, that comprehensive appraising, record- 
ing, and reporting of results are matters of vital concern to 
those who seek improvement in the work of our schools and 
colleges. The Eight-Year Study has recognized the impor- 
tance of these aspects of school work. To assist the Thirty 
Schools in developing adequate programs of evaluation and 
reporting, committees and technical staffs were organized 
shortly after the Study began. The Commission was fortu- 
nate in securing the services of Eugene R. Smith and Ralph 
W. Tyler as leaders in this work. This volume reports in 
detail the steps that were taken to help the schools to dis- 
cover, record, and report the progress of students toward the 
whole range of desired goals.

The work reported here rests upon three basic convic- 
tions: first, that evaluation and recording should always be 
directly related to each school’s purposes; second, that any 
school’s evaluation program should be comprehensive, in- 
cluding appraisal of progress toward all the school’s major 
objectives; third, that teachers should participate in the
construction of all instruments of evaluation and forms for
records and reports.

It is impossible to estimate the wastage of material and 
human resources which results from education’s ignorance 
of the consequences of its efforts. Until schools and colleges 
develop adequate, comprehensive appraising and recording 
programs, that waste will continue. Although no one con- 
nected with the Eight-Year Study would claim that its work 
in these fields is complete or entirely satisfactory, it is clear 
that what is reported in this volume points the way to fuller
knowledge, more complete understanding, and wiser guidance
of youth.
WILFORD M. AIKIN

PREFACE

When the Directing Committee of the Commission on the
Relation of School and College appointed a Committee on 
Records and Reports, it assigned to this new committee the 
general task of recommending methods of obtaining and 
recording information about the pupils. The immediate rea- 
son for this assignment was the need of supplying to the 
colleges data upon which they could decide about the ac- 
ceptability of candidates who did not present the traditional 
pattern of subjects for entrance or had not submitted the 
usual entrance information in terms of marks and examina- 
tions. A second important reason was the desire of schools 
for help in their guidance programs. 

The instructions given this committee specified as its first
task the devising of methods of obtaining and recording
information about personality. It was necessary, however,
from the beginning to try to find ways of testing that would
neither determine nor depend upon the content of the
courses given in the various schools, yet would be reasonably
comparable and objective measures of knowledge and
power.

The committee met with some frequency for periods of
two or three days at a time. It soon announced to the
schools a list of comparable tests that seemed to have value
for estimating the degree of mastery attained by pupils in
various subject fields. Many of the schools tried these tests,
and some added others from quite a wide selection of those
of an objective type. It became apparent, however, that
even these tests were too much influenced by the content
studied to be acceptable to all of the schools. The reason
was that the schools were anxious to use the utmost flexibility
in meeting the needs of their pupils even when that
meant departing markedly from traditional subjects or their 
content. A period of experimentation followed, during 
which other work was accomplished. When it was recog- 
nized that no matter how valuable existing methods and 
material for testing might be for various purposes, never- 
theless they did not fit the need of the cooperating schools 
for testing that would measure the power attained, irre- 
spective of the way in which it had been reached, the Di- 
recting Committee obtained further funds and enlarged the 
branch responsible for testing, recording, and reporting. 
The final organization of this department was headed by 
an over-all committee called the Committee on Evaluation 
and Recording. It had responsibility for determining pol- 
icies, considering reports on work accomplished and giving 
direction about the next steps to be undertaken. Dr. Ralph 
W. Tyler was engaged as Research Director for this part of 
the Eight-Year Study, and was given as his particular assign- 
ment charge of the work on evaluation. This assignment 
included direction of the follow-up study of graduates of
the cooperating schools who were attending college, as well
as of the study of objectives and of the testing and other
evaluation carried on in the schools. Under Dr. Tyler’s
supervision the Evaluation Staff and a large number of
committees assisted in this part of the work. A detailed account
is given in Part I of this volume.


The chairman of the Committee on Evaluation and Recording
was given charge of the production of recording
forms, and of methods of reporting to the colleges and to
the homes. As a part of this work the original Committee on
Records and Reports, which had in the meantime published
two editions of the “Behavior Description,” described in
Part II, was assigned the continued study of personal
characteristics and their recording and reporting.


Other committees, whose members were chosen not only
from the cooperating schools but also from colleges and
from schools and other groups not definitely concerned with 
the Eight-Year Study, worked on the various problems con- 
cerned with records and reports and were responsible for 
the forms devised. Of much importance also was the help 
given by the various members of the staff. The assistance of 
the Director of the Study, the Research Director, the Cur- 
riculum Assistants and the Members of the Evaluation Staff 
was available both indirectly, through the results of their 
studies, and directly by means of conferences and attend- 
ance at group meetings. Dr. John W. M. Rothney deserves 
special mention since he has been Research Assistant to all 
of these committees since the change in organization. 
While it is not possible to list the large number of those 
who took part in the work on evaluation and that on record- 
ing, the committee in charge of these activities wishes to 
express its appreciation of the contributions made by those 
who assisted. Without their self-sacrificing cooperation, little
could have been accomplished.
EUGENE R. SMITH,
Chairman
DEVELOPMENT AND USE OF EVALUATION
INSTRUMENTS
Chapter I


PURPOSES AND PROCEDURES OF THE
EVALUATION STAFF

HOW THE EVALUATION STAFF CAME INTO EXISTENCE


The plan of the Eight-Year Study, as Dr. Smith explained
in the Preface, placed upon the cooperating schools the
responsibility for reporting in some detail the characteristics
and achievements of students who were recommended for
admission to college. Furthermore, the Directing Committee
of the Study expected the schools not only to record the
steps taken to develop new educational programs, but also
to appraise the effectiveness of these programs, so that other 
schools might benefit from their experience. 

After the first year it became clear that these tasks were
too great to be assumed together by the Committee on
Records and Reports. The magnitude of the work had be- 
come evident when the Committee on Records and Reports 
reviewed the available tests, examinations, and other devices 
for appraising student achievement. Most of the achieve- 
ment tests then on the market measured only the amount of 
information which students remembered, or some of the 
more specific subject skills like those in algebra and the
foreign languages. The new courses developed in the Thirty
Schools attempted to help students achieve several additional
qualities, such as more effective study skills, more
careful ways of thinking, a wider range of significant interests,
social rather than selfish attitudes. Hence, the available
achievement tests did not provide measures of many of the
more important achievements anticipated from these new
courses. Furthermore, the content of most significance in the
new courses was frequently different from that which had 
been included before. Hence, the available tests of informa- 
tion did not really measure the information which students 
would be obtaining in the new courses. A comprehensive 
appraisal of the new educational programs could not be car- 
ried on unless new means of evaluating achievement were 
developed. 

The Directing Committee obtained a preliminary subsidy 
from the General Education Board to explore the possibility 
of constructing devices which could be used in appraising 
the outcomes of the new work. During the autumn of 1934, 
the Thirty Schools were visited, inter-school committees 
were formed, and preliminary steps taken to construct 
needed instruments of evaluation. By the winter of 1935 it
seemed apparent that new instruments could be devised and
that a more comprehensive program of appraisal could be
conducted. Hence, a generous subsidy for the services of an
evaluation staff¹ was provided by the General Education
Board, and the work was continued until the close of the
Study. The Evaluation Staff was primarily concerned with
developing means by which the achievement of the students
in the schools could be appraised, and the strengths and
weaknesses of the school programs could be identified.

¹ During the exploratory period, Oscar K. Buros, of Rutgers University,
served as Associate Director. After helping to get the plan outlined, Mr.
Buros resigned as Associate Director of the Evaluation Staff and returned
to [illegible] University. From July, 1935, until September, 1938, Mr. Louis
[illegible] served as Associate Director. The Staff was then housed at the
Ohio State University. When Mr. Tyler, the Director, moved to the University
of Chicago in September, 1938, Mr. Maurice L. Hartung was made
Associate Director. Others who served as members of the staff at least
part time for one or more years were: Herbert J. Abraham, Dwight L.
[illegible], [illegible] Block, Charles L. Boye, Paul [illegible], Fred P.
Frutchey, Paul R. Grim, Chester William Harris, Louis M. Heil, John H.
Herrick, Clark [illegible], Walter [illegible], [illegible] Lauritsen,
Christine McGuire, Harold [illegible], George V. Sheviakov, Hilda Taba,
Harold Trimble, Cerelia K. Wasserstrom, Kay D. Watson, Leah Weisman.
Throughout, these persons have worked together as a unified staff.
Although the authorship of chapters is indicated in the table of contents,
in a very real sense the report is a staff document, the product of all
members of the staff. Each chapter was criticized and revised several times
by all those who were members of the staff at the time the report was
written.

In 1936 the first class enrolled in these new programs
graduated from the Thirty Schools, and most of them
entered college in the fall. This provided an opportunity to
appraise the school programs in terms of the success of their
graduates in college. Through the generosity of the General
Education Board, funds were provided for this study and a
second division of the Evaluation Staff² was established.
The report of the study of college success appears in another
volume. The present volume is devoted to the discussion
of evaluation in the Schools and methods of recording
and reporting.


SIGNIFICANCE OF THE EVALUATION PROJECT 


The term “evaluation” was used to describe the staff and 
the project rather than the term “measurement,” “test,” or 
“examination” because the term “evaluation” implies a proc- 
ess by which the values of an enterprise are ascertained. To 
help provide means by which the Thirty Schools could as- 
certain the values of their new programs was the basic pur- 
pose of the evaluation project. The project has significance 
not only for the Thirty Schools but for schools and colleges 
generally. Adequate appraisal of the educational program 
of a school or college is rarely made. Yet an appraisal of an 
educational institution is fundamentally only the process by
which we find out how far the objectives of the institution
are being realized. This seems a simple and straightforward
task, and the efforts at evaluation of certain social institutions
are not very complex. For example, in the case of a
retail business enterprise the most commonly recognized
objectives are two: namely, the distribution of large quantities
of goods and the making of profit from the sale of these
goods. The methods for determining the quantities of goods
sold and the profits are tangible and not very difficult to
apply. Hence, the problem of evaluation is not usually
considered a perplexing one, and although the business
enterprise devotes a portion of its time and energy to appropriate
accounting procedures, so as to make a periodical evaluation
of its activities, we do not find a high degree of uncertainty
about the methods of evaluation.

² Composed of [illegible] L. Bergstresser, Dean Chamberlin, Enid Straw
Chamberlin, Neal Drought, William E. Scott, Harold Threlkeld.

In education, however, the problem of evaluation is more 
complex for several reasons. In the first place, since schools 
generally have not agreed upon their fundamental objec- 
tives, there is doubt as to what values schools expect to 
attain and therefore what results to look for in the process 
of evaluation. Even when the objectives of a school are 
agreed upon and stated, they are frequently vague and
require clarification in order to be understood. Furthermore, 
the methods of obtaining evidence about the attainment of 
some of these educational objectives are more difficult and 
less direct processes than those used in appraising a business.
It is easy to see how to measure the amount of profit in
a retail store; it is not so easy to devise ways for measuring
the educational changes taking place in students in the
school. Finally, the task of summarizing and interpreting the 
results of an evaluation of the school is complicated. Sum- 
maries of educational evaluation are needed for several dif- 
ferent groups, that is, for students, teachers, administrators, 
parents, and patrons. Each of these groups may need somewhat
different information, or at least it will be necessary to
present the data in different terms. It is easy to see, then, 
that educational evaluation requires more intensive study 
than evaluation of many other institutions. The work of the 
Evaluation Staff should help to demonstrate procedures by 
which the process of evaluation may be carried on and to
provide instruments and devices that may be used in evaluation
or that may suggest ideas for the construction of other
instruments.

MAJOR PURPOSES OF EVALUATION


In perceiving the appropriate place of evaluation in mod- 
ern education, consideration must be given to the purposes 
which a program of evaluation may serve. At present the 
purposes most commonly emphasized in schools and colleges 
are the grading of students, their grouping and promotion, 
reports to parents, and financial reports to the board of education
or to the board of trustees. A comprehensive program
of evaluation should serve a broader range of purposes than
these.


One important purpose of evaluation is to make a periodic
check on the effectiveness of the educational institution, and
thus to indicate the points at which improvements in the 
program are necessary. In a business enterprise the monthly 
balance sheet serves to identify those departments in which 
profits have been low and those products which have not 
sold well. This serves as a stimulus to a re-examination and 
a revision of practices in the retail establishment. In a sim- 
ilar fashion, a periodic evaluation of the school or college, if 
comprehensively undertaken, should reveal points of strength 
which ought to be continued and points where practices 
need modification. This is helpful to all schools, not just to
schools which are experimenting.

A very important purpose of evaluation which is frequently
not recognized is to validate the hypotheses upon
which the educational institution operates. A school, whether
called “traditional” or “progressive,” organizes its curriculum
on the basis of a plan which seems to the staff to be
satisfactory, but in reality not enough is yet known about
curriculum construction to be sure that a given plan will
work satisfactorily in a particular community. On that account,
the curriculum of every school is based upon hypoth-
eses, that is, the best judgments the staff can make on the 
basis of available information. In some cases these hypoth- 
eses are not valid, and the educational institution may con- 
tinue for years utilizing a poorly organized curriculum be- 
cause no careful evaluation has been made to check the 
validity of its hypotheses. For example, many high schools 
and colleges have constructed the curriculum on the hypoth- 
esis that students would develop writing habits and skills 
appropriate to all their needs if this responsibility were left 
entirely to the English classes. Careful appraisal has shown 
that this hypothesis is rarely, if ever, valid. Similarly, in a 
program of guidance the effort to care for personal and 
social maladjustments among students in a large school is 
sometimes based on the hypothesis that the provision of a 
well-trained guidance officer for the school will eliminate 
maladjustments. Systematic evaluation has generally shown 
that one officer has little effect unless a great deal of sup- 
plementary effort is devoted to educating teachers in child 
development and to revising the curriculum at those points 
where it promotes maladjustments. In the same way, many 
of our administrative policies and practices are based upon 
judgments which in a particular case may not be sound. 
Every educational institution has the responsibility of test- 
ing the major hypotheses upon which it operates and of 
adding to the fund of tested principles upon which schools 
may better operate in the future. 

A third important purpose of evaluation is to provide in-
formation basic to effective guidance of individual students. 
Only as we appraise the student's achievement and as we 
get a comprehensive description of his growth and develop- 
ment are we in a position to give him sound guidance. This 
implies evaluation sufficiently comprehensive to appraise 
all significant aspects of the student's accomplishments. 
Merely the judgment that he is doing average work in a 
particular course is not enough. We need to find out more
accurately where he is progressing and where he is having 
difficulties. 

A fourth purpose of evaluation is to provide a certain 
psychological security to the school staff, to the students, 
and to the parents. The responsibilities of an educational 
institution are broad and involve aspects which seem quite 
intangible to the casual observer. Frequently the staff be- 
comes a bit worried and is in doubt as to whether it is 
really accomplishing its major objectives. This uncertainty 
may be a good thing if it leads to a careful appraisal and 
constructive measures for improvement of the program; but 
without systematic evaluation the tendency is for the staff 
to become less secure and sometimes to retreat to activities 
which give tangible results although they may be less important.
Often we seek security through emphasizing procedures
which are extraneous and sometimes harmful to the
best educational work of the school. Thus, high school teachers
may devote an undue amount of energy to coaching for
scholarship tests or college entrance examinations because
the success of students on these examinations serves as
tangible evidence that something has been accomplished.
However, since these examinations may be appropriate for
only a portion of the high school student body, concentration
of attention upon them may actually hinder the total
[illegible] of the high school. For such teachers
a comprehensive evaluation which gives a careful check on
all aspects of the program would provide the kind of security
that is necessary for their continued [illegible] and
self-confidence. This need is [illegible] teachers who are
developing and conducting [illegible] educational program
[illegible] uncertainty of their pioneering efforts [illegible].
They view with dismay or resentment
efforts to appraise their work in terms of devices appropriate
[illegible] recognize that the effectiveness of the new work can be
fairly appraised only in terms of its objectives, which in certain
respects differ from the purposes of the older program.
Students and parents are also subject to this feeling of in- 
security and in many cases desire some kind of tangible 
evidence that the educational program is effective. If this is 
not provided by a comprehensive plan of evaluation, then 
students and parents are likely to turn to tangible but ex- 
traneous factors for their security. 

A fifth purpose of evaluation which should be emphasized 
is to provide a sound basis for public relations. No factor is 
as important in establishing constructive and cooperative 
relations with the community as an understanding on the 
part of the community of the effectiveness of its educational 
institutions. A careful and comprehensive evaluation should 
provide evidence that can be widely publicized and used to 
inform the community about the value of the school or col- 
lege program. Many of the criticisms expressed by patrons 
and parents can be met and turned to constructive cooperation
if concrete evidence is available regarding the accom-
plishments of the school. 

Evaluation can contribute to these five purposes. It can 
provide a periodic check which gives direction to the con- 
tinued improvement of the program of the school; it can 
help to validate some of the important hypotheses upon 
which the program operates; it can furnish data about individual
students essential to wise guidance; it can give a
more satisfactory foundation for the psychological security
of the staff, of parents, and of students; and it can supply a 
sound basis for public relations. These purposes were basic 
to the Thirty Schools but they are also important to all 
schools. For these purposes to be achieved, however, they 
must be kept continually in mind in planning and in devel- 
oping the program of evaluation. The Evaluation Staff real- 
ized that the decision as to what is to be evaluated, the 
techniques for appraisal, and the summary and interpretation
of results should all be worked out in terms of these
important purposes.

BASIC ASSUMPTIONS


In developing the program, the Evaluation Staff accepted 
certain basic assumptions. Eight of them were of particular 
importance. In the first place, it was assumed that educa- 
tion is a process which seeks to change the behavior pat- 
terns of human beings. It is obvious that we expect students 
to change in some respects as they go through an educational
program. An educated man is different from one who
has no education, and presumably this difference is due to
the educational experience. It is also generally recognized
that these changes brought about by education are modifications
in the ways in which the educated man reacts, that is,
changes in his ways of behaving. Generally, as a result of
education we expect students to recall and to use ideas
which they did not have before, to develop various skills,
as in reading and writing, which they did not previously
possess, to improve their ways of thinking, to modify their
reactions to esthetic experiences as in the arts, and so on. It
seems safe to say on the basis of our present conception of
learning, that education, when it is effective, changes the
behavior patterns of human beings.

A second basic assumption was that the kinds of changes
in behavior patterns in human beings which the school
seeks to bring about are its educational objectives. The fundamental
purpose of an education is to effect changes in
the behavior of the student, that is, in the way he thinks,
and feels, and acts. The aims of any educational program
cannot well be stated in terms of the content of the program
or in terms of the methods and procedures followed by the
teachers, for these are only means to other ends. Basically,
the goals of education represent these changes in human
beings which we hope to bring about through education.
The kinds of ideas which we expect students to get and to 
use, the kinds of skills which we hope they will develop, 
the techniques of thinking which we hope they will acquire, 
the ways in which we hope they will learn to react to 
esthetic experiences—these are illustrations of educational 
objectives. 

A third basic assumption was referred to at the opening 
of the chapter. An educational program is appraised by find- 
ing out how far the objectives of the program are actually 
being realized. Since the program seeks to bring about cer- 
tain changes in the behavior of students, and since these are 
the fundamental educational objectives, then it follows that 
an evaluation of the educational program is a process for 
finding out to what degree these changes in the students are 
actually taking place. 

The fourth basic assumption was that human behavior is 
ordinarily so complex that it cannot be adequately described 
or measured by a single term or a single dimension. Several 
aspects or dimensions are usually necessary to describe or 
measure a particular phase of human behavior. Hence, we 
did not conceive that a single score, a single category, or a
single grade would serve to summarize the evaluation of 
any phase of the student's achievement. Rather, it was antic- 
ipated that multiple scores, categories, or descriptions would 
need to be developed. 

The fifth assumption was a companion to the fourth. It 
was assumed that the way in which the student organizes 
his behavior patterns is an important aspect to be appraised. 
There is always the danger that the identification of these 
various types of objectives will result in their treatment as 
isolated bits of behavior. Thus, the recognition that an edu- 
cational program seeks to change the student's information, 
skills, ways of thinking, attitudes, and interests, may result 
in an evaluation program which appraises the development 
of each of these aspects of behavior separately, and makes
no effort to relate them. We must not forget that the human 
being reacts in a fairly unified fashion; hence, in any given 
situation information is not usually separated from skills, 
from ways of thinking, or from attitudes, interests, and ap- 
preciations. For example, a student who encounters an im- 
portant social-civic problem is expected to draw upon his 
information, to use such skill as he has in locating addi- 
tional facts, to think through the problem critically, to make 
choices of courses of action in terms of fundamental values 
and attitudes, and to be continually interested in better solu- 
tions to such problems. This clearly involves the relation- 
ship of various behavior patterns and their better integra- 
tion. The way the student grows in his ability to relate his 
various reactions is an important aspect of his development
and an important part of any evaluation of his educational 
achievement. 

A sixth basic assumption was that the methods of evalua- 
tion are not limited to the giving of paper and pencil tests; 
any device which provides valid evidence regarding the 
progress of students toward educational objectives is appropriate.
As a matter of practice, most programs of appraisal
have been limited to written examinations or paper and
pencil tests of some type. Perhaps this has been due to the
long tradition associated with written examinations or perhaps
to the greater ease with which written examinations
may be given and the results summarized. However, a consideration
of the kinds of objectives formulated for general
education makes clear that written examinations are not
likely to provide an adequate appraisal for all of these objectives.
A written test may be a valid measure of information
recalled and ideas remembered. In many cases, too, the
student's skill in writing and in mathematics may be shown
by written tests, and it is also true that various techniques
of thinking may be evidenced through more novel types of
written test materials. On the other hand, evidence regarding
the improvement of health practices, personal-social adjustment,
interests, and attitudes may require a much wider
repertoire of appraisal techniques. This assumption emphasizes
the wider range of techniques which may be used in
evaluation, such as observational records, anecdotal records,
questionnaires, interviews, check lists, records of activities,
products made, and the like. The selection of evaluation
techniques should be made in terms of the appropriateness
of these techniques for the kind of behavior to be appraised.

A seventh basic assumption was that the nature of the 
appraisal influences teaching and learning. If students are 
periodically examined on certain content, the tendency will 
be for them to concentrate their study on this material, even 
though this content is given little or no emphasis in the 
course of study. Teachers, too, are frequently influenced by 
their conception of the achievement tests used. If these tests 
are thought to emphasize certain points, these points will be 
emphasized in teaching even though they are not included 
in the plan of the course. This influence of 
teaching and learning led the Evalu 
velop evaluation instruments and methods in harmony with 
the new curricula and, as far as possible, of a non-restrictive 
nature. That is, major attention was given to appraisal de- 
vices appropriate to of curriculum content and 


appropriateness 


appraisal upon 
ation Staff to try to de- 


; ct-matter tests since these 
mmon informational material in the cur- 


at the responsibility 
А ‘onged to the staff and 
clientele of the school. It was not the duty of the Evaluation 
r to help develop the 

Ods of interpretation. 
Hence, this volume does not contain a | 
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of the Thirty Schools or the results obtained by the use of 
the evaluation instruments in the schools. This volume is a 
report of the development of techniques for evaluation. 

The evaluation program utilized other assumptions but 
these eight were of particular importance because they 
guided the general procedure by which the evaluation pro- 
gram was developed. They showed the necessity for basing 
an evaluation program upon educational objectives, and they 
indicated that educational objectives for purposes of evalu- 
ation must be stated in terms of changes in behavior of stu- 
dents; they emphasized the multiple aspects of behavior 
and the importance of the relation of these various aspects 
of behavior rather than treatment of them in isolation; and 
they made clear the possibility of a wide range of evaluation 
techniques. 

GENERAL PROCEDURES IN DEVELOPING THE
EVALUATION PROGRAM

The general procedure followed in developing the evaluation
program involved seven major steps. Since the program was a
cooperative one, including both the Schools and the Evaluation
Staff, it should be clear that although the report was prepared
by the staff, the work was done by a large number of persons. No
one of the instruments developed is the product of a single
author. All have required the efforts of various members of the
school staffs and the Evaluation Staff.


1. Formulating Objectives 


As the first step, each school faculty was asked to formulate
a statement of its educational objectives. Since the Schools
were in the process of curriculum revision, several of them had
already taken this step. This is not just an evaluation
activity, for it is usually considered one of the important
steps in curriculum construction. It is not necessary here to
point out that the selection of the educational objectives of a
school and their validation require studies of several sorts.
Valid educational objectives are not arrived at as a compromise
among the various whims or preferences of individual faculty
members but are reached on the basis of considered judgment
utilizing evidence regarding the demands of society, the
characteristics of students, the potential contributions which
various fields of learning may make, the social and educational
philosophy of the school or college, and what we know from the
psychology of learning as to the attainability of various types
of objectives. Hence, many of the schools spent a great deal of
time on this step and arranged to re-examine their objectives
periodically.

2. Classification of Objectives 


As a second step, these statements of objectives from the
Thirty Schools were combined into one comprehensive list and
classified into major types. Before classification, the
objectives were of various levels of generality and specificity
and too numerous for practicable treatment. Furthermore, it was
anticipated that the classification would be useful in guiding
further curriculum development, because if properly made it
would suggest types of learning experiences likely to be useful
in helping to attain the objectives. A classification is of
particular importance for evaluation because the types of
objectives indicate the kinds of evaluation techniques essential
to an adequate appraisal. The problem of classification is
illustrated by the following partial list of objectives
formulated by one school:


 1. Acquiring information about various important aspects
    of nutrition
 2. Becoming familiar with dependable sources of information
    relating to nutrition
 3. Developing the ability to deal effectively with nutrition
    problems arising in later life
 4. Acquiring information about major natural resources
 5. Becoming familiar with sources of information regarding
    natural resources
 6. Acquiring the ability to utilize and to interpret maps
 7. Developing attitudes favoring conservation and better
    utilization of natural resources
 8. Becoming familiar with a range of types of literature
 9. Acquiring facility in interpreting literary materials
10. Developing broad and mature reading interests
11. Developing appreciation of literature
12. Acquiring information about important aspects of our
    scientific world
13. Developing understanding of some of the basic scientific
    concepts which help to interpret the world of science
14. Improving ability to draw reasonable generalizations
    from scientific data
15. Improving ability to apply principles of science to
    problems arising in daily life
16. Developing better personal-social adjustment
17. Constructing a consistent philosophy of life


These sample statements of objectives are of different levels
of specificity and might well be grouped together under a
smaller number of major headings. Thus, for purposes of
evaluation, the several objectives having to do with the
acquisition of information in various fields could be classified
under one heading, since the methods of appraising the
acquisition of information are somewhat similar in the various
fields. Similarly, various objectives having to do with
techniques of thinking, such as drawing reasonable inferences
from data and the application of principles to new problems,
could be classified under the general heading of development of
effective methods of thinking, because the means of appraisal
for these objectives are somewhat similar. Furthermore, the
methods of instruction appropriate for these techniques of
thinking have similarities even though the content differs
widely. Eventually, the following classification was used in
general by the Staff:


MAJOR TYPES OF OBJECTIVES

 1. The development of effective methods of thinking
 2. The cultivation of useful work habits and study skills
 3. The inculcation of social attitudes
 4. The acquisition of a wide range of significant interests
 5. The development of increased appreciation of music,
    art, literature, and other esthetic experiences
 6. The development of social sensitivity
 7. The development of better personal-social adjustment
 8. The acquisition of important information
 9. The development of physical health
10. The development of a consistent philosophy of life


This classification is not ideal but it served a useful
purpose by focusing attention upon ten areas in which evaluation
instruments were needed. It also helped to suggest emphases
important in the curricular development of the Eight-Year Study.
The classification of objectives will be improved as evidence
accumulates regarding the social significance of different
behavior patterns and regarding the correlation and consistency
among the various specific reactions classified under each type
of behavior. Until such research has been carried farther, each
school or college will find useful some classification which
serves the two purposes suggested.


*The appraisal of the development of physical health [...]
technical medical training, was not worked [...] the Evaluation
[...]

3. Defining Objectives in Terms of Behavior

The third step was to define each of these types of objec- 
tives in terms of behavior. This step is always necessary be- 
cause in any list some objectives are stated in terms so vague 
and nebulous that the kind of behavior they imply is not 
clear. Thus, a type of objective such as the development of 
effective methods of thinking may mean different things to 
different people. Only as “effective methods of thinking” is
defined in terms of the range of reactions expected of students
can we be sure what is to be evaluated under this
classification. In similar fashion, such a classification as 
“useful work habits and study skills” needs to be defined by 
listing the work habits the student is expected to develop 
and the study skills which he may be expected to acquire. 

In defining each of these classes of objectives, committees 
were formed composed of representatives from the Schools 
and from the Evaluation Staff. Usually, a committee was 
formed for each major type of objective. Since each com- 
mittee included teachers from schools that had emphasized 
this type of objective, it was possible to clarify the meaning 
of the objective not in terms of a dictionary definition but 
rather in terms of descriptions of behavior teachers had in 
mind when this objective was emphasized. The committee 
procedure in defining an objective was to shuttle back and 
forth between general and specific objectives, the general 
helping to give wider implication to the specific, and the 
specific helping to clarify the general.

The resulting definitions will be found in subsequent
chapters; however, a brief illustration may be appropriate
here. The committee on the evaluation of effective methods of
thinking identified various kinds of behavior which the Schools
were seeking to develop as aspects of effective thinking. Three
types of behavior patterns were considered important by all the
Schools. These were: (1) the ability to formulate reasonable
generalizations from specific data;
(2) the ability to apply principles to new situations; and
(3) the ability to evaluate material purporting to be argu- 
ment, that is, to judge the logic of the argument. When the 
committee proceeded to define the kinds of data which they 
expected students to use in drawing generalizations, the 
principles which they expected students to be able to apply, 
and the kinds of situations in which they expected students 
to apply such principles, and when they had identified the 
types of arguments which they expected students to ap- 
praise critically, a clear enough definition was available to 
serve as a guide in the further development of an evaluation 
program for this class of objectives. This process of defini- 
tion had to be carried through in connection with each of 
the types of objectives for which an appraisal program was 
developed. 


4. Suggesting Situations in Which the Achievement 
of Objectives Will Be Shown 

The next problem was for each committee to identify
situations in which students could be expected to display these
types of behavior so that we could know where to go to obtain
evidence regarding this objective. When each objective has been
clearly defined, this fourth step is not difficult. For example,
one aspect of thinking defined in the third step was the ability
to draw reasonable generalizations from specific data. An
opportunity to exhibit such behavior would be provided when
typical sets of data were presented to students and they were
asked to formulate the generalizations which seemed reasonable
to them. Another aspect of thinking defined in the third step
was the ability to apply specified principles, such as
principles of nutrition, to specified types of problems, such as
those relating to diet. Hence, it seemed obvious that at least
two kinds of situations would give evidence of such abilities.
One would be a situation in which the student was presented with
these problems, for example, dietary problems, and asked to work
out solutions utilizing appropriate principles of nutrition.
Another kind of situation would be one in which the students
were given descriptions of certain nutritional conditions
together with a statement regarding the diet of the people
involved, and the students were asked to explain how these
nutritional conditions could have come about, using appropriate
nutritional principles in their explanations. As a third
illustration, the definition of objectives identified as one
educational goal the ability to locate dependable information
relating to specified types of problems. It seemed obvious that
a situation which would give students a chance to show this
ability would be one in which they were asked to find
information relating to these specified problems.

One value of this fourth step was to suggest a much wider
range of situations which might be used in evaluation than have
commonly been utilized. By the time the fourth step was
completed, there were listed a considerable number of types of
situations which gave students a chance to indicate the sort of
behavior patterns they had developed. These were potential
"test situations."

5. Selecting and Trying Promising
Evaluation Methods

The fifth step in the evaluation procedure involved the
selection and trial of promising methods for obtaining evidence
regarding each type of objective. Before attempting to construct
new evaluation instruments, each committee examined tests and
other instruments already developed to see whether they would
serve as satisfactory means for appraising the objective. Only
limited test bibliographies were then available.* In addition to
examining bibliographies, the committees obtained copies of
those instruments which seemed to have some relation to their
objectives. In examining an instrument the committee members
tried to judge whether the student taking the test could be
expected to carry out the kind of behavior indicated in the
committee’s definition of this objective. Then, too, the
situations used in the instruments were compared with those
suggested in the fourth step as to their likelihood of evoking
the behavior to be measured. The committees recognized that they
might be misled by undue optimism in the name or the description
of the test, and sought to guard against it. Even though a test
was called a general culture test, or a world history test, or a
general mathematics test, it was generally found that it
measured only one or two of the objectives which teachers of
these fields considered important. In order to estimate what the
test did measure, it was necessary to examine the test
situations to judge what kind of reaction must be made by the
student in seeking to answer the questions. It also proved
useful to examine any evidence reported which helped to indicate
the kind of behavior the test was actually measuring.

*Now, any group working on an evaluation program will find
useful a more complete bibliography of evaluation instruments,
such as the Buros Mental Measurements Yearbook. This
bibliography not only lists tests and other appraisal
instruments which are commercially available, but also in- [...]

At this point most of the committees found that no tests were
available to measure certain major aspects of the important
objectives. In such cases, it was necessary to construct
additional new instruments in order to make a really
comprehensive appraisal of the educational program in the Thirty
Schools. The nature of the instruments to be built varied with
the types of objectives for which no available instruments were
found. Every committee, however, found it helpful in
constructing these instruments to set up some of the situations
suggested in step four and actually to try them out with
students to see how far they could be used as test situations.
By the time the fifth step had been carried through, certain
available tests were selected and tried out and certain new
appraisal instruments were constructed and given tentative
trial.

6. Developing and Improving Appraisal Methods

The sixth major step was to select on the basis of this
preliminary trial the more promising appraisal methods for
further development and improvement. This further development
and improvement was largely the responsibility of the Evaluation
Staff. The committees met from time to time to review the work
of the Staff, and many teachers were asked to criticize and make
suggestions for improvement. Obviously, however, the detailed
work had to be done by the Staff.

The basis for selecting devices for further development
included the degree to which the appraisal method was found to
give results consistent with other evidences regarding the
student's attainment of this objective and the extent to which
the appraisal method could be practicably used under the
conditions prevailing in the Schools. The refinement and
improvement consisted in working out directions which were
unambiguous, modifying exercises which were found not to give
discriminating results, eliminating exercises which were found
to be almost exact duplicates of other exercises in terms of the
type of reaction elicited from the student, developing
practicable and easily interpretable records of the student’s
behavior, and making other revisions which gave more clear-cut
measures, which provided a more representative and adequate
sample of the student's reaction, and which improved the ease
with which the instrument could be used.

An important problem in the refinement and improvement of an
evaluation instrument proved to be the determination of the
aspects of student behavior to be summarized and the decision
regarding the units or terms in which each aspect was to be
summarized. For example, consider a test constructed to appraise
the ability of students to formulate reasonable generalizations
from data new to them. An obvious type of test situation would
be one in which sets of data new to the student were presented
to him and he was asked to examine the data and to formulate
generalizations which seemed reasonable to him. When we approach
the question of summarizing his behavior in some form which
provides a measurement or appraisal, we are faced with the
problem of identifying aspects, that is, dimensions of the
behavior to measure, and of deciding upon units of measurement
to use. One aspect which is important in judging the value of
the generalization formulated is its relevance. Generalizations
which have no relevance to the data are obviously not
satisfactory. If this aspect is to be measured, there are
several possible units of measurement which might be used. For
example, we could set up a subjective scale for degree of
relevance and have judges apply this scale to each
generalization, rating it at some point on this scale. Another
unit of measurement could be used by classifying each
generalization as relevant to the data or irrelevant to the
data, thus measuring the relevance in terms of the number of the
student's generalizations which are classified as relevant. On
the other hand, since students may differ markedly in the total
number of generalizations formulated, a better unit of measure
for the degree of relevance might be the per cent of the
student’s generalizations which are classified as relevant. A
second aspect which has some importance in appraising
generalizations of this type would be the degree to which
relevant generalizations are carefully formulated and involve no
overgeneralizations, that is, generalizations more [...] than
the data would justify. If this aspect were [...] several
possible units could [...]. One possible unit might be the
judgment of the reader of the paper as to the degree to which
each generalization was carefully or incautiously formulated.
This kind of unit involves a considerable degree of subjective
judgment so that many might prefer the simple categorization of
each relevant generalization as either going beyond the data or
not going beyond the data. In this case, a unit of measurement
might be the per cent of relevant generalizations not going
beyond the data. Perhaps these illustrations are sufficient to
show that it is always necessary in the development of new
evaluation instruments or in the use of those which have been
developed by others to decide on the aspects of the behavior to
be described or measured and the terms or units which will be
used in describing or
measuring this behavior. 
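
The arithmetic behind the two per cent units just described can
be sketched briefly. What follows is only an illustration in
Python, not a procedure taken from the Staff's work: the function
names, the form in which a reader's judgments are recorded, and
the sample paper are all hypothetical.

```python
def percent(part, whole):
    """Return part as a per cent of whole, or None when whole is zero."""
    if whole == 0:
        return None
    return 100.0 * part / whole


def summarize_generalizations(judgments):
    """Summarize one student's paper.

    `judgments` is a list of (relevant, beyond_data) pairs, one pair
    per generalization, as classified by a reader. This record format
    is an assumption made for the sketch.
    """
    total = len(judgments)
    relevant = [j for j in judgments if j[0]]
    within_data = [j for j in relevant if not j[1]]
    return {
        # Simple count of generalizations judged relevant.
        "number_relevant": len(relevant),
        # Per cent of all generalizations judged relevant.
        "percent_relevant": percent(len(relevant), total),
        # Per cent of relevant generalizations not going beyond the data.
        "percent_relevant_within_data": percent(len(within_data), len(relevant)),
    }


# Hypothetical paper: five generalizations, four judged relevant,
# one of the relevant ones judged to go beyond the data.
paper = [(True, False), (True, False), (True, True), (True, False), (False, False)]
summary = summarize_generalizations(paper)
print(summary)
```

Reporting the count alongside the per cent keeps visible the
point made above: two students may have the same per cent of
relevant generalizations while differing markedly in the total
number formulated.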


7. Interpreting Results

The seventh and final step in the procedure of evaluation was
to devise means for interpreting and using the results of the
various instruments of evaluation. The previous steps resulted
in the selection or the development of a range of procedures
which could be used periodically in appraising the degree to
which students were acquiring the objectives considered
important in a given school. These instruments provided a series
of scores and descriptions which served to measure various
aspects of the behavior patterns of the students. As these
instruments were used, a great number of scores or verbal
summaries became available at each appraisal period. Each of
these scores or verbal summaries measured an aspect of behavior
considered important and represented a phase of the objectives
of the school. The Staff then conducted comparability studies
for certain of the instruments so that the scores or verbal
summaries could be compared with scores or verbal summaries
previously obtained; by this comparison some estimate of the
degree of change or growth of students could be made. However,
the meaning of these scores became fuller through various
additional studies.

One type of study involved the identification of scores 
typically made by students in similar classes, in similar in- 
stitutions, or with other similar characteristics. Another help- 
ful study involved a summary and analysis of the typical 
growth or changes made in these scores from year to year. 
A third type involved studies of the interrelationship of sev- 
eral scores to identify patterns. These patterns are not only 
useful when obtained among several scores dealing with the 
behavior relating to one objective, but are also useful in 
seeing more clearly the relation among the objectives. It 
was pointed out in the introductory section of this chapter 
that human behavior is to a large degree unified and that 
efforts to analyze behavior into different types of objectives 
are useful but may do some harm if the essential
interrelationships of various aspects of behavior are forgotten.
It was found important in this seventh step to examine the
progress students were making toward each of the several
objectives in order to get more clearly the pattern of
development of each student and of the group as a whole and also
to obtain hypotheses helpful in explaining the types of
development taking place. Thus, for example, the evaluation
results in one school showed that students were making marked
progress in the acquisition of specific information and were
also shifting markedly in their attitudes toward specific social
[...] time they showed high [...] their various social
attitudes, and were making little [...] facts and principles
[...] hypothesis for further [...] the students were being
exposed to too large an [...] not being given adequate [...]
materials, to interpret them thoroughly [...] course so as to
provide for a smaller amount of new material, the introduction
of more opportunities for application, and the emphasis upon
thoroughness of interpretation and reorganization. This revision
in the course resulted in corresponding improvements in the
pattern of student achievement. If this revision had not
resulted in corresponding improvements, other hypotheses which
might explain the results would have been considered. This
procedure illustrates a useful means of interpreting the results
of several evaluation instruments. It was found that each school
needed methods for interpreting and using the results of
appraisal so as to improve the educational program and to guide
individual students more wisely.
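
The three kinds of supplementary study mentioned in this step
(typical scores for a comparable group, typical growth from one
appraisal period to the next, and the interrelationship of
several scores) lend themselves to a small numerical sketch. The
Python below is offered only under stated assumptions: the text
does not say which statistics the Staff computed, so the median
as the "typical" score and the Pearson coefficient as the measure
of interrelationship are choices made here for illustration, and
all scores and names are invented.

```python
from statistics import mean, median


def typical_score(scores):
    """Typical score for a group; the median is an assumed choice."""
    return median(scores)


def growth(earlier, later):
    """Change for each student between two appraisal periods.

    Both lists must give the students in the same order.
    """
    return [b - a for a, b in zip(earlier, later)]


def pearson_r(xs, ys):
    """Pearson correlation of two score lists, or None if either is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / (sxx * syy) ** 0.5


# Invented scores for four students on two instruments.
information_year1 = [40, 50, 60, 70]
information_year2 = [55, 62, 75, 80]
application_year2 = [30, 32, 31, 33]

info_growth = growth(information_year1, information_year2)
print("typical information score, year 1:", typical_score(information_year1))
print("growth in information:", info_growth, "mean:", mean(info_growth))
print("information vs. application, year 2:",
      pearson_r(information_year2, application_year2))
```

A pattern such as large gains in information together with little
change in application is the sort of result which, in the example
above, led the school to form a hypothesis and revise the course.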

The usefulness of the evaluation program depends very 
largely upon the degree to which the results are intelligently 
interpreted and applied by the teachers and school officers. 
The Evaluation Staff, however, had some responsibility in 
developing methods for interpreting the results intelligently 
and in helping teachers and school officers to use them most 
helpfully. Hence, in addition to making these studies of the 
instruments, members of the Evaluation Staff visited a number
of the Schools and went over the results with the school staffs,
suggesting possible interpretations and indicating methods by
which these interpretations could be more adequately verified
and used. As a result of these preliminary visits, certain
methods of interpretation were developed. At this point members
of the school staffs who were participating in summer workshops
were asked to try these methods of interpretation and to
criticize them. Then, for a period of two years, opportunity was
provided for at least one representative from each school to
spend a considerable period of time in the staff headquarters to
gain further familiarity with the evaluation instruments, with
their interpretation, and with their use. These school
representatives received the training on the assumption that
they would have opportunity for giving leadership to the
evaluation program in
their respective schools. As a result of this experience, the 
staff believes that a program of testing or evaluation can 
reach greater fruition when a systematic attempt is made to 
provide for the training of teachers and school officers in 
the interpretation and use of evaluation results. 


DIVISION OF LABOR IN THE EVALUATION PROGRAM


The previous description of the development of the evalu- 
ation program explained that it involved the cooperation of 
the school personnel and the Evaluation Staff. This does not 
imply that teachers, school officers, and Evaluation Staff 
members were all performing the same functions. Although there
was some overlapping of functions, there was also a general plan
for division of labor. One major division of labor was based on
the principle that the school’s duty is to evaluate its program,
while the technician’s function is to help develop means of
evaluation. Furthermore, in following through the steps of
evaluation, there was some division of duties. Every faculty
member and school officer bore some responsibility for the
formulation of the objectives of his school. The classification
of objectives into major types of behavior was largely a
function of the Evaluation Staff because the primary purpose of
this classification was to place in the same group those
objectives which involved similar types of student reactions,
and which might conceivably involve somewhat similar techniques
of appraisal.

The further definition and clarification of each class of
objectives was the task of an interschool committee composed of
teachers, school officers, and members of the Evaluation Staff.
The staff members raised questions and suggested directions for
discussion which would help to define or clarify the given type
of objective, but most of the defining was done by the
representatives of the schools which had emphasized this type of
objective.

The interschool committee also suggested situations in
which the desired behavior might be shown by students. 
The school representatives then assumed responsibility for 
trying out these situations to see if they would serve as 
means of evaluation. The review of these trials, their criti- 
cism, and plans for improving the methods of evaluation 
were carried on by the entire committee. From this point on, 
the refining of the evaluation instrument and its develop- 
ment for constructive use was largely the task of members 
of the Evaluation Staff. However, teachers and school offi- 
cers gave helpful criticisms and suggestions and eventually 
determined whether an instrument was worth using and 
could practicably be used in a given school. Finally, the
school staff was expected to assume responsibility for obtaining
evidence of growth and studying these results.

This plan has wide applicability. It provides a way in 
which technicians in testing and evaluation may work con- 
structively with teachers and school officers to develop an 
evaluation program. It avoids the danger on the one hand 
of having instruments constructed by technicians who are 
not clear about the curriculum and guidance program of 
the school, and on the other hand the formulation of an 
evaluation program by persons who are relatively unfamiliar 
with methods of describing and measuring human behavior. 

SUMMARY 


This brief description of the steps followed in developing 
the evaluation program should have indicated that the proc- 
ess of evaluation was conceived as an integral part of the 
educational process. It was not thought of as simply the
giving of a few ready-made tests and the tabulations of
resulting scores. It was believed to be a recurring process
involving the formulation of objectives, their clearer definition,
plans to study students’ reactions in the light of these
objectives, and continued efforts to interpret the results of
such appraisals in terms which throw helpful light on the
educational program and on the individual student. This 
sort of procedure goes on as a continuing cycle. Studying 
the results of evaluation often leads to a reformulation and 
improvement in the conception of the objectives to be ob- 
tained. The results of evaluation and any reformulation of 
objectives will suggest desirable modifications in teaching 
and in the educational program itself. Modifications in the 
objectives and in the educational program will result in cor- 
responding modifications in the program of evaluation. So 
the cycle goes on. 

As the evaluation committees carried on their work, it 
became clear that an evaluation program is also a potent 
method of continued teacher education. The recurring de- 
mand for the formulation and clarification of objectives, the 
continuing study of the reactions of students in terms of 
these objectives, and the persistent attempt to relate the 
results obtained from various sorts of measurement are all 
means for focusing the interests and efforts of teachers
upon the most vital parts of the educational process. The
results in several schools indicate that evaluation provides
a means for the continued improvement of the educational
program, for an ever deepening understanding of students
with a consequent increase in the effectiveness of the school.

The subsequent chapters describe in more detail the development
of evaluation instruments for certain types of objectives.
Space does not permit the description of all the
evaluation instruments developed. Tests of effective methods
of thinking are described because this objective was of concern
to all the schools, and few instruments of this sort had
previously been developed. On the other hand, although
work habits and study skills were emphasized in most of
the schools, the description of the instruments developed is
not included in this report. The committee identified the following
work habits and study skills for which methods of
appraisal were needed:


Range of Work Habits and Study Skills

1.1 Effective Use of Study Time
    1.11 Habit of using large blocks of free time effectively
    1.12 Habit of budgeting his time
    1.13 Habit of sustained application rather than working sporadically
    1.14 Habit of meeting promptly study obligations
    1.15 Habit of carrying work through to completion

1.2 Conditions for Effective Study
    1.21 Knowledge of proper working conditions
    1.22 Habit of providing proper working conditions for himself
    1.23 Habit of working independently, that is, working under his own direction and initiative

1.3 Effective Planning of Study
    1.31 Habit of planning in advance
    1.32 Habit of choosing problems for investigation which have significance for him
    1.33 Ability to define a problem
    1.34 Habit of analyzing a problem so as to sense its implications
    1.35 Ability to determine data needed in an investigation

1.4 Selection of Sources
    1.41 Awareness of kinds of information which may be obtained from various sources
    1.42 Awareness of the limitations of the various sources of data
    1.43 Habit of using appropriate sources of information, including printed materials, lectures, interviews, observations, and so on

1.5 Effective Use of Various Sources of Data
    1.51 Use of library
        1.511 Knowledge of important library tools
        1.512 Ability to use the card catalogue in a library
    1.52 Use of books
        1.521 Ability to use the dictionary
        1.522 Habit of using the helps (such as the Index) in books
        1.523 Ability to use maps, charts and diagrams
    1.53 Reading
        1.531 Ability to read a variety of materials for a variety of purposes using a variety of reading techniques
        1.532 Power to read with discrimination
        1.533 Ability to read rapidly
        1.534 Development of a more effective reading vocabulary
    1.54 Ability to get helpful information from other persons
        1.541 Ability to understand material presented orally
        1.542 Facility in the techniques of discussion, particularly discussions which clarify the issues in controversial questions
        1.543 Ability to obtain information from interviews with people
    1.55 Ability to obtain helpful information from field trips and other excursions
    1.56 Ability to obtain information from laboratory experiments
    1.57 Habit of obtaining needed information from observations

1.6 Determining Relevancy of Data
    1.61 Ability to determine whether the data found are relevant to the particular problem

1.7 Recording and Organizing Data
    1.71 Habit of taking useful notes for various purposes from observations, lectures, interviews, and reading
    1.72 Ability to outline material for various purposes
    1.73 Ability to make an effective organization so that the material may be readily recalled, as in notetaking
    1.74 Ability to make an effective organization for written presentation of a topic
    1.75 Ability to make an effective organization for oral presentation of a topic
    1.76 Ability to write effective summaries

1.8 Presentation of the Results of Study
    1.81 Ability to make an effective written presentation of the results of study
        1.811 Habit of differentiating quoted material from summarized material in writing reports
        1.812 Facility in handwriting or in typing
    1.82 Ability to make an effective oral presentation of the results of study

1.9 Habit of Evaluating Each Step in an Investigation
    1.91 Habit of considering the dependability of the data obtained from various sources
    1.92 Habit of considering the relative importance of the various ideas obtained from various sources
    1.93 Habit of refraining from generalization until data are adequate
    1.94 Habit of testing his own generalizations
    1.95 Habit of criticizing his own investigations

A number of preliminary instruments were constructed
for this extensive list of habits and skills.¹ Most of these
have not been sufficiently refined to justify inclusion in this
volume.

Instruments for appraising social attitudes are treated in
the chapter on the evaluation of social sensitivity. Because
so many tests of information were already available, and
because techniques for measuring the recall and use of information
were well understood by teachers, the committees
did not devote major attention to developing further instruments
of this type. A few were constructed for specific purposes,
but these are not reported here.

¹ A monograph, "Study Skills and Work Habits: Some Selected Materials,"
was prepared by a committee headed by Cecile White Flemming
of the Horace Mann School for Girls, and was circulated in mimeographed
form in 1935. It is now out of print.

The appraisal of the philosophy of life developed by the 
students involves the use of evidence from many of the other 
areas, such as thinking, social attitudes, interests, apprecia- 
tions, and social sensitivity. Hence, methods for evaluating 
the student’s philosophy of life are primarily methods of 
combining and interpreting the results of other measure- 
ments. Methods of interpretation are discussed in Chapter 
VII. Finally, the planning of a comprehensive evaluation 
program and the problems of recording are considered. 

It is obvious that there are other areas and other problems 
in the construction and use of evaluation instruments still 
untouched. The Evaluation Staff hopes, however, that its 
experience will be useful in guiding further endeavor so 
that ultimately schools may be able to evaluate their work 
with a high degree of comprehensiveness. 


Chapter II 
ASPECTS OF THINKING 


INTRODUCTION 


The responsibility of secondary schools for training citizens 
who can think clearly has been so long and so frequently 
acknowledged that it is now almost taken for granted. The 
educational objectives classifiable under the generic heading 
"clear thinking" are numerous and varied as to statement, 
but there can be little doubt concerning their fundamental 
importance. Although in recent years there has been increas- 
ing recognition of other responsibilities and purposes, there 
has been no accompanying tendency to demote clear thinking
to a minor role as an educational objective. It was therefore
not surprising to find considerable emphasis upon this
objective in the statements of purposes submitted to the 
Evaluation Staff by the schools participating in the Eight- 
Year Study. 

The fact that an objective has been stated frequently or 
with emphasis does not insure that its meaning and implica- 
tions are sufficiently clear to guide effective teaching or to 
serve as a basis for the evaluation of achievement. In this
respect the "clear thinking" objectives as originally stated by
the schools were no different from other even more "intangible"
objectives. An examination of the pertinent educational
literature, moreover, revealed that most of the available
analyses of these objectives were unsatisfactory for the purpose
of evaluation. It therefore proved necessary to devote
considerable time to clarification of the objectives and to
analysis of the behaviors which would reveal that students
were achieving them. In the course of the analysis it was
convenient to break up the general objective into a limited
number of component parts, and then to analyze each of
these in some detail. The aspects of clear or "critical" thinking
which were selected dealt with the ability to interpret
data, with the ability to apply principles of science, of the
social studies, and of logical reasoning in general, and finally,
with certain abilities associated with an understanding of
the nature of proof. This chapter will be devoted chiefly to
the description of each of these aspects as they were eventually
analyzed, and to a description of some of the evaluation
instruments which were developed to evaluate the associated
abilities.

It may be well to note at the outset that the abilities 
involved in the aspects of thinking listed above are over- 
lapping. Although the abilities called into action in a suc- 
cessful interpretation of a set of data seem to be primarily 
inductive, and those utilized in the other aspects are more
deductive in nature, it is neither necessary nor desirable to 
emphasize such distinctions. In connection with any given 
problem, the process of reflective thinking, as defined by 
Dewey and others, is likely to call upon a number of the 
abilities to be described in connection with each major aspect 
of thinking mentioned above. It should also be noted that 
other important aspects of thinking—for example, the ability 
to formulate hypotheses—are only implicitly included in the 
above list and receive only cursory attention in the following 
discussion. The separation of clear thinking into these and 
other aspects is a product of the analysis and is not to be 
considered as inherent in the process of clear thinking. It was 
convenient because it facilitated the exploration of the larger 
objective and the development of practicable means of evaluation.
A satisfactory evaluation of the thinking abilities of
students involves a synthesis of the data obtained from various
instruments.

The four major aspects of clear thinking listed above not
only overlap among themselves, but they also overlap with 
other educational objectives. The attitudes and the emotions 
of students may influence their ability to think clearly in cer- 
tain situations. This has been explicitly recognized in the 
analyses of these objectives and in the construction of the 
evaluation instruments to be described in this chapter. At 
the moment, it is necessary to mention only that evaluation 
of the disposition to think critically has not been extensively 
worked upon and is not discussed in the following pages. In 
the opinion of the Evaluation Staff, the best available means 
is some sort of observational record, and this method demands
only the simplest of techniques supported by alert
sensitivity and perseverance on the part of the observer. Evidence
of the disposition to think critically collected by this
method would, however, be a valuable addition to other evidence
relevant to clear thinking of the sort to be described
later.

The scope of this phase of the evaluation project made it
necessary to omit many details in the discussion of some of
the instruments. For purposes of illustration, certain procedures
are explained at length in relation to a selected instrument,
and are condensed or omitted elsewhere. The
analysis of the application of principles in the field of social
science is treated somewhat differently from that for the
natural sciences, and will consequently be found in Chapter
III on "Social Sensitivity." The following sections include
the analyses which were made of the ability to interpret data,
of the application of principles of science and of logical
reasoning, and of abilities associated with an understanding
of the nature of proof. The instruments to measure achievement
that were developed and some of their technical characteristics
and uses will also be described. No account is
given of similar instruments developed by individual teachers.

I. INTERPRETATION OF DATA
ANALYSIS OF THE OBJECTIVE 


The Committee on the Interpretation of Data, composed of 
representatives from each school interested in this objective 
and members of the Evaluation Staff, began with two major 
questions: What do students do when they interpret data 
well? What kinds of data should they be able to interpret? 
Behaviors Involved in Interpretation of Data 

Some conceived of interpretation as a complex behavior 
which included the ability to judge the accuracy and rele- 
vance of data, to perceive relationships in data, to recognize 
the limitations of data, and to formulate hypotheses on the 
basis of data. From the wide range of behaviors which were 
suggested, the committee selected two which seemed to them 
to be of paramount importance: (1) the ability to perceive
relationships in data, and (2) the ability to recognize the
limitations of data.

The first of these involves the ability to make comparisons,
to see elements common to several items of the data, and to
recognize prevailing tendencies or trends in the data. These
behaviors are dependent on the ability [...] to make simple
computations [...] symbolism used. It became [...] specific
points on the graph, relate these to the base lines, recognize
variations in length of bars or slope of graph line, and so on.
In many cases, students must understand simple statistical
terms (e.g., "average"), the units used, and the conventional
methods of presentation of different forms of data.
A second type of behavior which the teachers expect of 
students is the ability to recognize the limitations of given 
data even when the items are assumed to be dependable. A
student who develops this ability recognizes what other information,
in addition to that given, he must have in order
to be reasonably sure of certain types of interpretations. He
refrains from making judgments relative to implied causes,
effects, or purposes until he has necessary facts at hand. He
recognizes the error in allowing his emotions to carry him
beyond the given facts when he judges conclusions that
affect him personally. If he holds rigidly to what is established
by the data, the kinds of generalizations that he can
make without qualifications are limited. He recognizes that
many interpretations must be regarded as almost completely
uncertain because the facts given are insufficient to support
such interpretations even with appropriately stated qualifications.
These behaviors do not preclude the possibility of making
qualified inferences when the situation warrants. This type
of interpretation can be made, for example, when the data
reveal definite trends. By qualifying the statement with
words such as "probably" a student may then extrapolate,
that is, make interpretations which are somewhat beyond
the facts but in agreement with a definitely established trend.
Or a student may interpolate, in other words, make a qualified
inference concerning an omitted point between observed
points in a set of data which reveal an established trend. In
another case, a student may risk a qualified prediction relative
to similar sets of data applying to similar conditions.
Even when the inferences are qualified, the student must be
careful not to allow his statements to go far beyond the observed
facts. These inferences are necessarily confined to a
rather narrow range whose extent depends somewhat on the
subject to which the data apply. Fundamentally, the objective
involves making a distinction between what is established
by the data alone and what is being read into the data
by the interpreter.
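
The distinction between reading a point, interpolating, and extrapolating can be made concrete with a small sketch. The figures below are invented for illustration, the function names are hypothetical, and the straight-line estimate is only an assumed stand-in for "a definitely established trend"; nothing here is taken from the instruments of the Study.

```python
# Illustrative sketch only: the yearly figures are invented for this example.

def classify_inference(known_x, target_x):
    """Label an inferred point as a reading, an interpolation, or an extrapolation."""
    if target_x in known_x:
        return "reading points"
    if min(known_x) < target_x < max(known_x):
        return "interpolation"
    return "extrapolation"


def linear_estimate(points, target_x):
    """Estimate a value from the straight line through the first and last observed points."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (target_x - x0)


def qualified_statement(points, target_x):
    """State an observed value flatly, but qualify any inferred value with 'probably'."""
    xs = [x for x, _ in points]
    kind = classify_inference(xs, target_x)
    if kind == "reading points":
        return f"The value at {target_x} is {dict(points)[target_x]}."
    estimate = linear_estimate(points, target_x)
    return f"The value at {target_x} is probably about {estimate:g} ({kind})."


observed = [(1930, 100), (1932, 120), (1936, 160)]
print(qualified_statement(observed, 1932))  # an observed point, stated without qualification
print(qualified_statement(observed, 1934))  # between observed points
print(qualified_statement(observed, 1938))  # beyond the observed points
```

Only the observed point is stated without qualification; both inferred points carry the word "probably," which is the behavior the paragraph describes.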


During the analysis of the objective it was also recognized
that the ability to make original interpretations and the
ability to judge critically interpretations made by others
might not be closely related. When judging a stated inter- 
pretation one may derive a clue that directs attention to 
specific relationships in the data. An original interpretation 
usually involves the ability to perceive these relationships 
without the aid of suggestions or directions. In the discus- 
sion of this point it was noted, on the one hand, that rela- 
tively few individuals have occasion to collect data and make 
original interpretations, since most of the data encountered 
in life are already wholly or partially interpreted. Critical 
judgment of these interpretations is, however, very impor- 
tant. On the other hand, it was noted that some individuals 
do have frequent need to collect data and formulate original 
interpretations, and almost everyone has some need of the 
abilities involved. A decision was made to concentrate pri- 
marily upon evaluation of the ability to judge interpretations 
made by others, and to study the relationship between this 
and the ability to make original interpretations.

Several other behaviors were recognized as ones which 
may be considered important in connection with the inter- 
pretation of data. One of these is the ability to evaluate the 
dependability of data; another is the ability to formulate 
hypotheses. In evaluating the dependability of data, a stu- 
dent might question the competence, bias, or integrity of the 
person who presents the data; he might attempt to determine 
the adequacy and appropriateness of the methods, tech- 
niques, and controls used in obtaining the data; he might 
question the adequacy and the appropriateness of the 
methods of summarizing the data. In formulating hypotheses
on the basis of given data, the student might infer probable
causes or he might predict probable effects. Information
other than that given in the data may be required in order
to make a satisfactory evaluation or to formulate a reasonable
hypothesis. Thus recall of information might also be regarded
as an ability involved in the interpretation of data.

Although the importance of all these aspects of interpretation
of data was fully recognized, the teachers selected for
more intensive study those behaviors on which they proposed
to give the greatest emphasis in their respective schools.
Whether a student is making original interpretations or judg- 
ing interpretations made by others, the teachers expect the 
student who has achieved the objective to perceive relation- 
ships in data and to recognize the limitations of data. These 
two important behaviors were therefore selected for par- 
ticular attention in developing evaluation instruments. 


Kind of Data 
The second major question which had to be answered in
analyzing the objective dealt with the kinds of data that
students should be able to interpret. The committee recognized
several different ways of classifying data. Among these
were the following: (1) according to the form of presentation,
(2) according to the subject-matter fields from which
the data are drawn, (3) according to problems or areas of
living with which the data deal, (4) according to types of
relationships inherent in the data, (5) according to the purpose
the data are intended to serve, (6) according to various
levels of generality, (7) according to the degree to which the
possibility of making meaningful interpretations depends
upon the knowledge of other facts.

The form of presentation of data may vary. For example,
data may be presented in graphical form. Pictures, maps,
cartoons, and various types of graphs, such as line or bar
graphs, are familiar examples. Data also are often presented
in tabular form. Such tables are frequently found in reports
of experiments, election returns, scores of baseball games,
and so on. Sometimes data are not set off from the prose
form of reading matter but are incorporated in the context.
This method of presentation is often used in editorials,
printed speeches, and news items. Sometimes the same data
are presented in several forms; this situation is commonly
found in advertisements, for example.

Data may be drawn from various subject fields. Data from 
the fields of economics and sociology commonly appear in 
newspapers, magazines, and current books. Data from the 
fields of physics, chemistry, biology, and other sciences are 
presented in many publications which are commonly read; 
advertisements, for example, often incorporate data from 
these fields. 

The classification of data in terms of areas of living or 
problems would probably make use of categories such as 
vocation, health, government, transportation, family relation- 
ships, and others of similar type. Classification according to 
types of relationship would emphasize categories such as 
chronological trends, relationship of parts to a whole, and 
the like. If data are differentiated in terms of the purposes 
which they are intended to serve, distinctions may be made, 
for example, between what purports to be an impartial 
presentation of facts and a presentation intended to sell a
particular idea or defend a special interest. Different levels
of generality are illustrated by data showing unemployment
in a single city in contrast to data on unemployment in an
entire state or country. If the latter are available, often more
meaningful interpretations could be made concerning the
situation in the single city, and hence this same illustration
indicates how additional information may influence the interpretation,
and how the amount of such information needed
may form a basis of classification.

Although other classifications are possible and were con- 
sidered, for purposes of evaluation the teachers chose the 
following criteria for the selection of the data to be presented 
to students for interpretation: (1) data presented in various 
forms; (2) data relating to various subject fields; (3) data
relating to major problem areas; (4) data including various
types of relationships. As is often the case, these criteria are 
not independent, and a given set of data will satisfy several 
criteria simultaneously. 
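
Because one set of data satisfies several criteria at once, a selection of sets can be checked against all four criteria together. The following is a minimal sketch under stated assumptions: the three data sets, their category labels, and the names `CRITERIA` and `coverage` are all hypothetical and serve only to illustrate the idea.

```python
# Hypothetical tagging of data sets; every value below is invented for illustration.
CRITERIA = ("form", "subject_field", "problem_area", "relationship")

data_sets = [
    {"form": "table", "subject_field": "economics",
     "problem_area": "vocation", "relationship": "chronological trend"},
    {"form": "line graph", "subject_field": "biology",
     "problem_area": "health", "relationship": "chronological trend"},
    {"form": "prose", "subject_field": "sociology",
     "problem_area": "government", "relationship": "parts to a whole"},
]


def coverage(sets):
    """Collect, for each criterion, the distinct categories the selected sets represent."""
    seen = {criterion: set() for criterion in CRITERIA}
    for data_set in sets:
        for criterion in CRITERIA:
            seen[criterion].add(data_set[criterion])
    return seen


for criterion, categories in coverage(data_sets).items():
    print(criterion, sorted(categories))
```

Each set contributes to every criterion simultaneously, so a small number of well-chosen sets can give variety in form, field, problem area, and type of relationship.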

In order that the interpretation may not be made from 
memory, it is necessary that the data be “new” to the student 
in the sense that this particular organization of the facts has 
not previously been interpreted for the student by someone
else. If he has heard or read an interpretation of this or- 
ganization of facts, his response may represent recall of an 
interpretation made by another and not give a measure of 
his own ability to interpret. 

The analysis of the objective thus resulted not only in a
description of the behaviors which might be included under
the phrase "interpretation of data," but also in a conscious
restriction of the scope of the eventual evaluation. This restriction
applied to the types of behavior which were to be
emphasized, and to the criteria for the selection of data
which were to be presented to students.

THE DEVELOPMENT OF EVALUATION INSTRUMENTS


Preliminary Investigations 

Observations of a student's many overt behaviors in re- 
sponding to data of various kinds is one way in which evi- 
dence of his ability to interpret data may be obtained. This 
type of evidence can probably be best secured by observa- 
tional records kept by teachers or other persons trained to 
observe and record these behaviors. Under certain conditions
a student's written materials, such as laboratory notebooks,
papers, etc., may be a fruitful source of evidence.
However, the time consumed and the possible lack of objectivity
of scores present serious difficulties in the use of
these techniques. Since these methods usually involved more
or less uncontrolled situations, teachers were interested in
devising a method that would better stabilize some of the
variable factors. The method which was selected makes use
of pencil and paper tests in which the student reacts in writ- 
ing to written data. Many methods of obtaining this type of 
evidence have been experimented with in the Study. A few 
will be discussed to present some of the approaches used 
and some of the difficulties the Evaluation Staff has encoun- 
tered in measuring the abilities involved. One of the most 
direct methods used was to present the student with sets of 
written data, ask him to write true statements concerning 
the data, and to appraise the interpretations which he wrote.
However, such a free-response essay-form presents several 
difficulties in evaluation. It was found that even when the 
number of interpretations to be made is specified in the di- 
rections, individual students tend to use a narrow range of 
relationships in their responses. Thus, the responses do not 
adequately sample the types of interpretations which the 
students are capable of making when their attention is fo- 
cussed on data relating to their own particular problems or 
concerns, or when breadth of treatment is encouraged by 
more specific directions in the test. Moreover, great difficulty 
is experienced in scoring such a test, for it is often impossible 
to be reasonably sure what the student means by his written 
statements. This perplexity may arise from ambiguity or
incompleteness of the student's statements or from peculiarities in
his style. It is possible to attain high objectivity for such a
test, but only after elaborate criteria for scoring have been
carefully set up. Even with such a device, it is a time-consuming
method. In one case, for example, it required approximately
90 hours for each of the trained markers to score
193 papers of ten exercises calling for responses of this type.
Because of these difficulties, this method of getting evidence
of a student's ability to interpret data is impractical for most
teachers.
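
The scale of the scoring burden is easier to judge per paper and per exercise. The short calculation below uses only the figures quoted above (90 hours, 193 papers, ten exercises) and assumes, as the wording suggests, that each marker scored every paper.

```python
# Figures quoted in the text; the assumption is that each marker scored all 193 papers.
hours_per_marker = 90
papers = 193
exercises_per_paper = 10

minutes_per_paper = hours_per_marker * 60 / papers
minutes_per_exercise = minutes_per_paper / exercises_per_paper

print(f"{minutes_per_paper:.1f} minutes per paper")
print(f"{minutes_per_exercise:.1f} minutes per exercise")
```

On these figures a marker spent roughly 28 minutes on each paper, or a little under three minutes on each exercise.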


In order to determine the types of interpretations students
should be expected to judge critically and the kinds of errors
commonly made in interpreting data, a study was made of
interpretations commonly found in editorials, advertisements, 
news items, reports of scientific experiments, and similar 
materials. For instance, the conclusions of many reports of 
experiments were critically studied in relation to the data on 
which they were based. In this and other such studies it was 
possible to discover the kinds of relationships involved in 
the interpretations, the kind of assumptions that were made, 
the accuracy and adequacy of the inferences made from the 
data. When students’ essay responses were also critically 
studied in the same way and comparisons made, it became 
apparent that the interpretations from both these sources 
were susceptible to virtually the same types of classifications.
One classification that could be made was in terms of the
kind of relationships involved. For convenience of reference,
these types are denoted by various words or phrases, such as
"extrapolation," "comparison of points," or "cause." They are
as follows:¹

1. Reading Points. This type of statement is usually
   merely a restatement of the data.
2. Comparison of Points. The statement is a comparison
   of two or more items or "points" in the data.
3. Cause. The statement presents a cause of conditions
   presented in the data.
4. Effect. The statement formulates a prediction of a
   probable effect of the conditions described.
5. Value Judgment. The statement presents a recommended
   course of action suggested by the data, or an
   opinion of what ought to be or ought not to be.
6. Recognition of Trend. The statement describes a prevailing
   tendency or trend in the data.
7. Comparison of Trends. The statement presents a
   comparison of two or more prevailing tendencies or
   trends in the data.
8. Extrapolation. The statement formulates a prediction
   of a point or item or fact which is not given in the
   data and lies beyond points or items or facts which
   are given in the data.
9. Interpolation. The statement formulates a prediction
   of a point or item or fact of data which lies between
   points or facts which are given in the data.
10. Sampling. The statements concern (a) only a part of
    the group described in the data, or (b) a larger group
    containing as a part of itself the group described in
    the data.
11. Purpose. The statement presents a judgment of purpose
    of the given data.

¹ For examples of statements of these types, see the sample problem on
page 52.


These types of interpretations may be also arranged into 
a concise and meaningful classification which emphasizes the 
difference in degree of accuracy with which they are used 
by students. Thus, students’ responses may include the fol- 
lowing: 


1. Interpretations which are accurate. These interpreta-
tions may formulate comparisons, trends, and specific
facts which are established by the data as true or
false and are correctly stated without qualification.
Other interpretations under this classification may be
concerned with sampling, extrapolation, or interpola-
tion. They are not fully supported by the given data,
but are probably true or probably false on the basis
of the trends established in the data, and are stated
by the student with sufficient qualification.

2. Interpretations which are overgeneralizations, that
is, interpretations containing unqualified or unwar-
ranted statements involving interpolation, extrapola-
tion, and sampling, or statements of cause, purpose, or
effect which cannot be established by the given data
even in qualified form. This type of error may be re-
ferred to as “going beyond the data.”

3. Interpretations which are undergeneralizations, that
is, which involve unnecessarily qualified statements
concerning specific facts, trends, and comparisons
which are established in the data. Such departures
from accuracy may be referred to as “overcaution.”

4. Interpretations which involve “crude errors”; for
example, the student errs by misreading the points or
trends in the data, by failing to understand meanings 
of terms, such as “average” and “per cent,” or by 
failing to relate properly the data of a graph to the 
base lines. 

Such analyses provided a basis for construction of a short- 
answer type of test exercise. This type of test does not pre- 
sent the difficulties in scoring inherent in the essay form and 
makes it possible for a student to react to many types of data 
in a limited time. During its development, the interpretation
of data test has passed through several transitional forms.
Statistical study of early forms suggested changes which
were incorporated in subsequent forms. For the sake of
simplicity of explanation, only the latest form of the
interpretation of data test (Form 2.52) will be described in
detail.

* Overcaution is not considered an error by everyone. Some
consider it evidence of a tendency to suspend judgment until
further evidence is available.


Structure of Interpretation of Data Test, Form 2.52

The test to be described is intended primarily for the
senior high school level. It contains ten sets of data selected
to satisfy the criteria set up by the committee interested in
the objective. These data are presented in various forms,
including tables, prose, charts, and different kinds of graphs.
The problems are selected from several fields (such as medi-
cine, home economics, sociology, genetics) and contain data
pertinent to such topics as technological unemployment,
heredity, crop rotation, immigration, government expendi-
tures, and health.

Each set of data is followed by 15 statements which pur- 
port to be interpretations. The student is asked to indicate 
his judgment of each of the statements by placing it in one 
of five categories as indicated by the short code given at the 
top of the sample exercise on page 52. In the sample, the 
list of responses accepted as correct by a jury of competent 
persons is given in the margin before each interpretation. A 
word or phrase describing the main kind of relationship in- 
volved follows each interpretation. 

A study of the sample exercise in relation to the following 
summary of the procedure used in constructing the test will 
indicate how the analyses described previously were utilized. 
It may also serve as a guide for teachers who wish to con- 
struct similar tests suited for use with their own students. 


1. The data were selected according to the criteria set 
up by the committee. 
2. Fifteen interpretative statements were made from
each set of data. The types of statements included
were based on an analysis of types of interpretations
which were found in current literature, the judgment
of teachers who were concerned with the objective,
and the analysis of responses of students who were
asked to write original interpretations. This approach
was used both to give the students an opportunity to
judge statements including typical errors made in
interpretations, and to insure the inclusion in the test
of types of interpretations which students encounter
and are capable of recognizing. These interpretations
involve the following types of behaviors: comparisons
of points of data, recognition and comparison of
trends, judgments of cause, effect, purpose, value,
analogy, extrapolation, interpolation, and sampling.
3. The types of relationship involved in the interpreta-
tions which the students are asked to judge were dis-
tributed among the five response categories as follows:

a. Interpretations adequately supported by the data, 
and so worded that they are meant to be judged 
by the students as true. These statements require 
the student to judge interpretations that involve: 
comparison of points in the data; recognition of 
trends; and comparison of trends. Ten per cent of
the total number of statements in the test are in
this category.


b. Interpretations inadequately supported by the
data, so worded that they are meant to be judged
probably true. These statements require the stu-
dents to judge interpretations that involve a
knowledge of the principles of prudent extrapola-
tion, interpolation, and sampling as previously de-
fined. They include inferences that go beyond the
data but are suggested by the data and are based
on trends or facts in the data. They also include
some conclusions that would be popularly inter-
preted as true. They are intended to contribute
information concerning the ability of students to
recognize the necessity for qualification in inter-
pretation. About 20 per cent of the total number of
statements are in this category.
c. Interpretations inadequately supported by the
data, so worded that they are meant to be judged
as based upon insufficient data. They give oppor-
tunity for the student to make judgments concern-
ing statements of analogies relating to the data,
concerning statements referring to a cause or an
effect of the situation revealed by the data, con-
cerning the purpose the data are supposed to
serve, and concerning a recommended course of
action supposedly desirable on the basis of the
data. Also included are some statements depend-
ing upon an injudicious use of interpolation, extra-
polation, and sampling. About 40 per cent of the
total number of statements are in this category.

* Although in this study analogy was not found to be used to
any great extent in interpretations of data, it was encountered
extensively in advertising and newspapers. It was also the
thought of the Evaluation Staff that analogy is an aspect of
scientific thinking which they desired to measure in different
contexts. . . .
* This proportion is based upon studies of reliabilities of
early forms.

d. Interpretations inadequately supported by the 
data, so worded that they are meant to be judged 
probably false. These include inferences which are 
suggested by the data but which are contrary to the 
trends or facts in the data, and conclusions which
would be popularly interpreted as false. The same 
types of interpretations are used here as in b. 
Twenty per cent of the total number of statements 
are in this category. 
e. Interpretations which are contradicted by the
data, so worded that they are meant to be judged
as false. These statements involve the same types
of interpretations as are listed in a above. Ten per
cent of the total number of statements are in this
category.
4. Within each test exercise the interpretations were ar-
ranged in random order. Directions to the students
were formulated. These directions asked students to
place each statement in one of the five different
categories.
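
The proportions in step 3 can be turned into nominal statement counts for a 150-statement test. The following is only an illustrative sketch: the names `TOTAL_STATEMENTS`, `SHARES`, and `nominal_counts` are hypothetical, and because the text says "about" 20 and "about" 40 per cent, the actual number of statements keyed in each category may differ slightly from these nominal figures.

```python
# Nominal distribution of keyed responses for a 150-statement form.
# Percentages are taken from the text; the counts are nominal only,
# since the text qualifies two of the percentages with "about".
TOTAL_STATEMENTS = 150
SHARES = {
    "true": 10,
    "probably true": 20,
    "insufficient data": 40,
    "probably false": 20,
    "false": 10,
}


def nominal_counts(total, shares):
    """Return the nominal number of statements keyed in each category."""
    return {name: total * pct // 100 for name, pct in shares.items()}


counts = nominal_counts(TOTAL_STATEMENTS, SHARES)
for name, count in counts.items():
    print(f"{name:18s} {count:3d}")

# Statements keyed true or false: the base for the true-false subscore.
print("true + false:", counts["true"] + counts["false"])
# Statements keyed probably true, insufficient data, or probably false:
# the opportunities to "go beyond the data".
print("qualified keys:", TOTAL_STATEMENTS - counts["true"] - counts["false"])
```

The two printed totals agree with the bases quoted in the scoring discussion (30 for true and false statements, 120 opportunities for going beyond the data).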


Before the test was considered ready for use, an analysis
of student responses was made. In each case where the judg-
ment of a large number of students conflicted with the key,
there was an attempt to analyze the student's thinking to see 
if the conflict in judgment was due to confusion in the test 
or to an erroneous concept held by the students. Ambiguous 
statements were revised, and a final key was drawn up. The 
scores made by students are, therefore, to be considered as 
a means of comparison of their thinking with the judgments 
of the jury. 


SAMPLE EXERCISE FROM FORM 2.52

These Data Alone
(1) are sufficient to make the statement true.
(2) are sufficient to indicate that the statement is
    probably true.
(3) are not sufficient to indicate whether there is any
    degree of truth or falsity in the statement.
(4) are sufficient to indicate that the statement is
    probably false.
(5) are sufficient to make the statement false.

PROBLEM I. This chart shows production, population, and em-
ployment on farms in the United States for each
fifth year between 1900 and 1925.

[Chart: Volume of Farm Production, Farm Population of
Employable Age, and Number of Farm Workers Employed,
1900 to 1925; vertical axis, Per Cent Relative to the
Year 1900.]

Statements

1. The ratio of agricultural production to the number of
   farm workers increased every five years between 1900
   and 1925.
2. The increase in agricultural production between 1910
   and 1925 was due to more widespread use of farm ma-
   chinery.
3. The average number of farm workers employed during
   the period 1920 to 1925 was higher than during the
   period 1915 to 1920.
4. The government should give relief to farm workers who
   are unemployed.
5. Between 1900 and 1925, the amount of fruit produced on
   farms in the United States increased about fifty per cent.
6. During the entire period between 1905 and 1925 there
   was an excess of farm population of employable age over
   the number of people needed to operate farms.
7. Wages paid farm workers in 1925 were low because there
   were more laborers than could be employed.
8. More workers were employed on farms in 1925 than in
   1900.
9. Since 1900, there has been an increase in production per
   worker in manufacturing similar to the increase in agri-
   culture.
10. Between 1900 and 1925, the volume of farm production
    increased over fifty per cent.
11. Farmers increased production after 1910 in order to take
    advantage of rapidly rising prices.
12. The average amount of farm production was higher in
    the period 1925 to 1930 than in the period 1920 to 1925.
13. Between 1900 and 1925, there was an increase in the
    farm population of employable age in the Middle West,
    the largest farming area in the United States.
14. Farm population of employable age was lower in 1930
    than in 1900.
15. The production of wheat, the largest agricultural crop in
    the United States, was as great in 1915 as in 1925.


Summarization of Scores

For purposes of exposition, the manner in which the an-
swer sheets from a class are scored may be described as fol-
lows. By tabulating a student’s response for each item in
relation to the jury's key for that item in the proper cell of
the following chart, a teacher can describe a student’s
achievement both as to accuracy and as to errors.

* In practice, the scoring may be done on the electric scoring
machine, or, if one is not available, by use of punched key
stencils.

CHART SHOWING HOW SCORES ARE DERIVED

As indicated by the chart, student responses can be de-
scribed in the following terms: general accuracy, caution,
beyond data, and crude errors. This terminology may be de-
fined as follows: General accuracy means the extent to which
the student agrees with the jury; that is, recognizes true
statements as true, probably true as probably true, etc. The
total number of statements which a student judged accu-
rately may be found by counting all of the tally marks in the
cells labeled a, g, m, s, and y. This number may be expressed
as a per cent of the maximum possible number of correct re-
sponses (150).

Since the judgment of the accuracy of the statements in-
volves different levels of discrimination, depending on
whether or not the interpretation needs to be qualified, it
was found helpful to derive the following subscores on ac-
curacy: (a) accuracy with probably true and probably false
statements, (b) accuracy with insufficient data statements,
and (c) accuracy with true and false statements. They indi-
cate the extent to which the student agrees with the jury in
judging these three types of statements taken separately.
. . . is derived from the number of . . . (expressed as a per
cent of 61). The third subscore is derived from the number
of tallies in cells a and y (expressed as a per cent of 30).

The going beyond the data score indicates the extent to 
which the student marks statements keyed probably true as 
true, statements keyed insufficient data as probably true or 
probably false, and statements keyed probably false as false. 
The student is then granting the interpretation greater cer- 
tainty than is warranted by the data. 

In order to determine how frequently a student has “gone 
beyond the data,” one may count the tallies in the cells 
labeled b, c, h, r, w, x. There are 120 opportunities for the
student to react in this way, and the per cent of such re-
sponses may easily be calculated.

The caution score indicates the extent to which the student 
marks statements keyed true as probably true, statements 
keyed probably true as based upon insufficient data, state-
ments keyed false as probably false, and statements keyed
probably false as based upon insufficient data. The student
is then refusing to attribute to the interpretations as much
certainty as the jury was willing to do.

The crude errors score indicates the extent to which the 
student marks true or probably true statements as false or 
probably false, or marks false or probably false statements as 
true or probably true. This type of error is often due to care-
lessness in reading the data or interpretations, or to a mis-
understanding of some terms involved in the data. Both of
the last two scores may be computed in the manner pre- 
scribed for previous scores. 

Omissions are scored in order to determine the actual
number of opportunities the student had to score in other
columns.
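
The scoring rules described above can be sketched in a few lines of code. This is a minimal illustration, not the scoring procedure actually used: the chart itself is not reproduced here, so the cell layout (rows for the student's response, columns for the jury's key, lettered a through y row by row) is an assumption inferred from the cell letters quoted in the text. The function and variable names are hypothetical, and the bases used for the caution and crude errors percentages are likewise assumptions, since the text gives bases only for the accuracy and beyond data scores.

```python
from collections import Counter

# Response categories in chart order: true, probably true,
# insufficient data, probably false, false.
CATEGORIES = ["T", "PT", "ID", "PF", "F"]
LETTERS = "abcdefghijklmnopqrstuvwxy"

# Cell groups quoted in the text (accurate, beyond data) or inferred
# from its verbal definitions (caution, crude errors).
ACCURATE = "agmsy"
BEYOND_DATA = "bchrwx"
CAUTION = "flnt"          # assumption: only the four cases the text names
CRUDE_ERRORS = "deijpquv"


def cell(response, key):
    """Cell letter, assuming rows = student response, columns = jury key."""
    return LETTERS[5 * CATEGORIES.index(response) + CATEGORIES.index(key)]


def summarize_scores(keys, responses):
    """Tally one answer sheet; a response of None counts as an omission."""
    tallies = Counter()
    omitted = 0
    for key, response in zip(keys, responses):
        if response is None:
            omitted += 1
        else:
            tallies[cell(response, key)] += 1

    keyed = Counter(keys)

    def pct(cells, base_keys):
        base = sum(keyed[k] for k in base_keys)
        hits = sum(tallies[c] for c in cells)
        return 100.0 * hits / base if base else 0.0

    return {
        "general_accuracy": pct(ACCURATE, CATEGORIES),
        "accuracy_pt_pf": pct("gs", ["PT", "PF"]),
        "accuracy_insufficient": pct("m", ["ID"]),
        "accuracy_true_false": pct("ay", ["T", "F"]),
        "beyond_data": pct(BEYOND_DATA, ["PT", "ID", "PF"]),
        # Bases for the next two scores are assumptions.
        "caution": pct(CAUTION, ["T", "PT", "PF", "F"]),
        "crude_errors": pct(CRUDE_ERRORS, ["T", "PT", "PF", "F"]),
        "omitted": omitted,
    }


# Hypothetical six-item answer sheet, for illustration only.
keys = ["T", "PT", "ID", "PF", "F", "ID"]
responses = ["T", "T", "PT", "ID", "T", None]
result = summarize_scores(keys, responses)
for name, value in result.items():
    print(name, value)
```

Under this assumed layout, two cells (a statement keyed true or false but marked insufficient data) belong to none of the groups named in the text, so the sketch leaves them uncounted.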
A form of data sheet on which scores from this test are
conveniently summarized is presented on page 57. The
scores made by seven students in the twelfth grade were
selected for purposes of illustration. At the bottom of the
. . .

. . . her class. In the case of the student called Homer, the
pattern of scores indicates that he recognized the limits of the
given data with an accuracy about equal to the average for 
his class. When he failed to judge accurately the limitations 
of the given data, Homer was overcautious in more judg- 
ments and went beyond the data in fewer judgments than 
was average for his class. 

The second question that the test scores should answer is: 
How accurately does the student perceive various types of 
relationships in the data? 

By examining the scores in columns 2, 3, 4, and 8, some 
tentative answers to this question may be obtained. As stated 
above, the score in column 1 gives the per cent of accuracy 
with which the student is able to judge limitations of inter- 
pretations dealing with all of the types of relationships in the 
test. Scores in columns 2, 3, and 4 are subscores of the gen- 
eral accuracy score. Each subscore refers to the accuracy 
with which the student judges certain of the relationships in- 
volved in the interpretation. For example, column 2 gives 
the per cent of accuracy of a student in recognizing those 
statements which are probably true or probably false. A high 
score here indicates that the student persistently applies with 
success the principles of prudent extrapolation, interpola-
tion, and sampling. Column 3 gives the per cent of accuracy
in judging statements which cannot be justified without the
use of information from other sources. These statements in-
clude relationships such as cause, effect, purpose, analogy,
as well as some statements of extrapolation, interpolation,
and sampling. Column 4 gives the per cent of accuracy of
a student in recognizing those statements which are true or
false. A high score indicates that the student is able to judge
accurately statements that involve comparisons of points in
the data, and recognition or comparison of trends. The per
cent of crude errors (column 8) indicates errors in which the
student marked interpretations true that the jury considered
false or probably false, and vice versa. Such errors may be
due to vocabulary or reading difficulties, carelessness, or in- 
ability to identify the relationship involved. 

The following examples may help to clarify this explana- 
tion. Peggy's score in column 2 indicates that she stands low 
in relation to her group in the ability to make the finer dis- 
criminations necessary to judge accurately those extrapola- 
tion, interpolation, and sampling statements which are based 
on trends in the data. She is relatively poor in the accuracy 
with which she judges statements based on insufficient evi-
dence, cause, effect, or purpose, as well as those extrapola-
tion, interpolation, and sampling items that fall in this cate-
gory. The score on accuracy with true and false statements
(column 4) seems to indicate an ability approaching the
average for her class in recognizing trends and comparisons
of trends or of points in the data. However, this can be deter-
mined only after studying the entire pattern of scores. In
view of Peggy's evident tendency to “go beyond the data,”
the higher score in column 4 may be a result of her tendency
to be “gullible” and to mark many statements as true or false.

Homer’s scores in columns 2, 3, and 4 seem to indicate a
greater accuracy in his judgment of statements based on in-
sufficient data than on the statements classified in the other
two categories. However, it is necessary again to consider
the entire pattern of scores to make a justifiable inference.
Homer's relatively high score on caution and low score on
beyond data imply that he tends to refuse to make judgments
of probability and classifies statements that are not well justi-
fied by the data as of the insufficient data type.

8 Intercorrelations have been computed to investigate the extent to which
scores described above are statistically independent. See Appendix. Al-
though positive correlation exists between each of the subscores on general
accuracy, the intercorrelation is not sufficiently high to permit the predic-
tion of one score from another. However, a high negative correlation exists
between the score on beyond data and insufficient data, and between gen-
eral accuracy and crude errors. From a statistical standpoint it is possible
in both these cases to predict one of these scores from the other without
appreciable loss of information about the student, but teachers find it less
difficult to interpret the individual scores when all these scores are
retained.

An examination of scores made by Joseph and Andrew 
shows that, although both boys receive the same score in 
general accuracy, for those judgments in which they fail to
be accurate Andrew tends to go beyond the data more often 
than Joseph. 

It is usually inadvisable to interpret scores on this test in 
terms of national norms, since opportunities to develop these 
abilities vary markedly from group to group. Data on means 
and standard deviations for certain groups are given in tables 
in the Appendix. If a group is known to be comparable to
these groups, these statistics may be helpful as a background
of comparison.


During the period of the Eight-Year Study a number of
instruments were developed for exploration of the ability to
interpret data. . . . of these were useful in . . . forms of the
test, . . . the particular needs . . . The discussion that fol-
lows . . . changes that have taken place in the test and the
reasons for them.

One of the earliest tests that explored certain aspects of
this objective was constructed to measure “the ability to
infer.” One short-answer form of this test required the stu-
dents to judge the best of five given inferences. A study of
the responses on this test and a corresponding essay form
yielded many clues concerning the types of inferences that
students make. A higher validity coefficient was secured
when the students were required to judge both best and
worst inferences than when they judged only the best.

9 R. W. Tyler, “Measuring the Ability to Infer,” Educational
Research Bulletin, IX (Nov. 19, 1930).

Results of exploratory tests using a three-response form 
and others using a five-response form yielded valuable in- 
formation concerning the objective. In one of the earliest of 
the five-response forms of the test, the student was presented 
with different types of data and asked to judge interpreta- 
tions made from them. The directions were as follows: 


Consider carefully each of the following statements, and indi-
cate in the columns to the right whether you believe:

1. the data alone justify the statement.
2. the data alone do not justify the statement.
3. the data together with your information suggest that the
   statement is probably true.
4. the data together with your information suggest that the
   statement is probably false.
5. the data together with your information are insufficient to
   make a decision concerning the statement.


This form was used in an attempt to get evidence of two 
kinds of behavior in interpretation of data, namely, ability to 
adhere rigidly to the data and reject interpretations that go 
beyond or are contradicted by the data; and the ability to 
draw meaningful inferences from those interpretations which 
go beyond the data but which appear highly probable or 
improbable in the light of other information known to stu-
dents. Difficulty was encountered in interpreting these scores,
since there was no way of setting up controls or standards
for judging the amount or quality of outside information a
student was using in judging the inferences presented. As
will be recalled, the definition of the objective accepted by
the committee emphasizes the ability of the students to
recognize what the given data reveal, and to distinguish ac-
ceptable inferences from those that cannot be justified with-
out using information or principles from other sources. This
restriction led to a reformulation of the directions, and there-
after they remained virtually the same in subsequent forms
of the test.

Teachers of several subject fields were interested in this 
objective. To meet their request some of the first forms in 
which the revised directions were used restricted the field 
from which the data were drawn to the natural sciences or 
the social sciences. Since it was believed that the behaviors
involved in these forms are not essentially different, it was 
deemed advisable to reduce the time required in measuring 
this objective by measuring in one instrument the achieve-
ment relative to several fields. Thus subsequent forms in-
cluded in the same booklet data drawn from both fields.
Statistical considerations (e.g., studies of reliability) indicate 
that this has not changed the homogeneity of the behavior to 
any great extent. 

The summarization of scores has remained, with one ex- 
ception, very much as it is found on the sample data sheet 
given above for Form 2.52. In early forms (2.2, 2.3, 2.4) the
beyond data scores had subscores which indicated the tend-
ency of the student to go beyond the data in the direction of
greater truth or in the direction of greater falsity than the
data warranted. From an analysis of responses it was found
that in general most students showed much greater tendency
to go beyond the data in the direction of judging the printed
statement as true than in judging it as false. Because of this
fact these subscores on “going beyond the data” did not
greatly aid the interpretation of scores and were omitted
from subsequent forms of the test. A caution score that was
found to be more meaningful in describing the behavior of
students was added.


10 Form 2.2, Interpretation of Data (Natural Sciences) and Form 2.3,
Interpretation of Data (Social Sciences).
11 Forms 2.4, 2.5, 2.51, 2.52, Interpretation of Data.

. . . suggested that greater reliability of certain scores could
be obtained by increasing the number of statements of each type
used in the test. These suggestions were used in building
Form 2.51 by including in each of the ten exercises 15 state-
ments which constituted a definite pattern of types of inter-
pretations and types of responses expected. An effort was
made to include in each exercise at least one statement in-
volving each type of relationship used in the test, but state-
ments including extrapolation, interpolation, and sampling
were used in greater number. The entire test was thus
lengthened from 119 statements in Form 2.5 to 150 state-
ments in Form 2.51 and the probably true or probably false
response was expected in 40 per cent of the statements.

The latest form of Interpretation of Data test (Form 2.52)
was intended to be comparable to Form 2.51. An effort was
made to match the form of presentation, types of interpreta-
tions, topics with which the data deal, and types of response
expected. Each of the two forms was administered within a
week to 105 students of the tenth grade, 133 students of the
eleventh grade, and 99 students of the twelfth grade of two
large high schools. The coefficient of correlation between the
two forms of the test for each category was computed by the
product-moment method. These coefficients, together with
means and standard deviations on each category for both
tests, are given in Table 1 below.

TABLE 1
Means and Standard Deviations for Tests 2.51 and 2.52;
Product-Moment Correlations between Forms 2.51 and 2.52

                     General   Pr. T-   Insuf.   True-            Beyond   Crude
Category      Form   Accuracy  Pr. F    Data     False   Caution  Data     Errors
Means         2.51   40.1      26.5     38.6     52.3    27.8     51.4     16.9
              2.52   45.2      24.4     53.7     70.6    26.4     37.5     13.8
Standard      2.51   10.4      14.0     15.2     16.2    11.0     12.4      5.97
Deviations    2.52   11.4      14.4     18.4     16.2    12.7     13.4      6.30
r (2.51, 2.52)        .85       .84      .83      .74     .85     . . .      .65

Although these correlations are fairly high, the fact that
they are no higher may be partially explained by the ob-
servation that more rigorous standards were used in keying
Form 2.52 and that some sources of ambiguity found to be
present in Form 2.51 were eliminated.

Since some teachers were interested in measuring the abili-
ties of junior high school students in interpreting data, a
form was developed for students of this grade level. The
criteria for the selection of data were similar to those used
in Form 2.52, and the advice of junior high school teachers
and librarians was sought in checking the appropriateness of
the data and the interpretations for students of this level. As
a result of this advice, an attempt was made to simplify this
instrument, in comparison with Form 2.52, in vocabulary, in
types of responses expected, in number of interpretations
used, and in problem areas or concepts involved. A prelim-
inary form (2.7) was constructed and administered, and after
a statistical study of the responses, the suggested improve-
ments were incorporated in the present test, Form 2.71.
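
For readers who wish to compare two forms of a test with their own classes, the product-moment coefficient reported in Table 1 can be computed as in the sketch below. The function name `product_moment` and the sample scores are hypothetical and serve only to illustrate the calculation; they are not data from the study.

```python
import math


def product_moment(x, y):
    """Pearson product-moment correlation between paired scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists of at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores on each form must vary")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical general accuracy scores of five students on two forms.
form_a = [38.0, 42.0, 51.0, 29.0, 45.0]
form_b = [44.0, 47.0, 58.0, 31.0, 46.0]
print(round(product_moment(form_a, form_b), 2))
```

One coefficient would be computed in this way for each score category (general accuracy, caution, beyond data, and so on), pairing each student's score on one form with the same student's score on the other.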

This test contains ten sets of data, each of which is fol- 
lowed by ten interpretations. The data deal with problems 
of safety, budgeting, sports, choice of vocation, cost of living, 
etc. The student is required to make three distinctions in
judging these interpretations. These are given in the direc-
tions of the test as follows:

A. Enough information . . .

B. Not enough information . . .

. . . caution, beyond data, crude errors, accuracy with true-
false, and accuracy with insufficient data. Reliability coeffi-
cients were computed by the Kuder-Richardson formula for
five populations drawn from each of grades seven, eight,
and nine. For these 15 populations the reliability coefficients
of the beyond data and insufficient data scores are of the
same order of magnitude as are those of the general accuracy
score. The reliabilities of the other scores analogous to those
of Form 2.52 are a little lower with the exception of those for
crude errors which, as one might expect, are erratic and
tend to be rather low. This same general pattern is found
for each grade.
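
A reliability coefficient of the Kuder-Richardson type can be sketched as follows. The sketch uses the formula commonly cited as Kuder-Richardson formula 20; treating that formula as the one meant by "case III" of the method is an assumption, as is the scoring of each item as 1 (agrees with the key) or 0. The function name `kuder_richardson_20` and the sample data are hypothetical.

```python
def kuder_richardson_20(item_scores):
    """KR-20 reliability for a students-by-items matrix of 0/1 scores."""
    n_students = len(item_scores)
    n_items = len(item_scores[0])
    if n_students < 2 or n_items < 2:
        raise ValueError("need at least two students and two items")

    # Variance of total scores (population form, dividing by n).
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("total scores must vary")

    # Sum over items of p * q, where p is the proportion passing the item.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_students
        sum_pq += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (var_total - sum_pq) / var_total


# Hypothetical responses of four students to three items.
sample = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
]
print(round(kuder_richardson_20(sample), 2))
```

A coefficient near 1 indicates that the items rank students consistently; scores built from few or rarely occurring responses, such as crude errors, would be expected to yield lower and less stable values.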


VALIDITY OF THE INTERPRETATION OF DATA TESTS


Two main aspects of the validity of the interpretation of
data tests will be considered: (1) the validity of the tests
as a measure of the students’ ability to judge interpretations
formulated by others, and (2) the validity of the tests as an
index of students’ ability to write original interpretations.


Ability to Judge Interpretations Made by Others 

The validity of this test as a measure of the ability to judge 
interpretations made by others depends upon several factors: 
(a) the correspondence between the behaviors demanded of 
students in the test and the behaviors defined in the state- 
ment of the objective, (b) the adequacy of sampling relative 
to form of presentation, to problem areas with which the 
data are associated, and to types of interpretations, (c) the 
appropriateness of the test as to difficulty for the high school 
level. 

In considering the first point, it should be recalled that the
test is so constructed as to afford the student an opportunity
to demonstrate the two main behaviors defined in the objec-
tive, namely, the ability to perceive relationships in the data
and the ability to recognize the limitations of the data. To
verify this, it will be necessary to review briefly the method
of construction of the test. Incorporated in the interpretations
which the student is asked to judge are the various types of
relationships, such as trends, comparisons, etc., that he is
expected to perceive, expressed in such a way as to have
varying degrees of substantiation from the given data. Thus
some statements are intended to be fully established or con-
tradicted by the data alone, some statements if properly
qualified are partially established or contradicted by the
data, and others are unjustified without the use of informa-
tion from other sources. The five-point response by which
the student indicates his judgment of the interpretations
forces a response by the student from which the extent of
his recognition of the limitations of the data and his percep-
tion of relationships may be inferred.

12 G. F. Kuder and M. W. Richardson, “The Theory of the Estimation
of Test Reliability,” Psychometrika, Vol. 2, No. 3 (Sept., 1937), pp. 151-
160. Throughout this report, wherever the Kuder-Richardson Method is in-
dicated, case III of this method was used. These and other data on Form
2.71 will be found in the Appendix.

It should also be recalled that the criteria for selection of
data were determined by the judgment of members of the
committee. Their knowledge of types of materials that students
read and an analysis of the types of data commonly
found in curricular and other reading materials form the
basis of their judgment of the adequacy of the sampling of
forms of presentation, of problem areas, and of types of
interpretations. The analysis made by E. W. Hellmich of textbooks
for social studies in the junior and senior high school
and in elementary college courses indicates that the subject
matter and types of presentation of the data used in Test
2.52 are those which students encounter.¹³

13 Eugene W. Hellmich, [...]mentary Social Studies
in Secondary Schools and [...], Columbia University,
Contributions to Education, No. 706, 1937. Studies in other fields
report similar results: for example, Robert C. Scarf, Mathematics
Necessary for [...], Master's Thesis, The University of Chicago, [...]

The appropriateness of the test for the high school level
can be considered in terms of two sources of evidence. First,
the interpretations represented by the statements in the test
are of the types students are found to use when they make
their own interpretations of data. Secondly, study of the
distribution of scores made by students who have taken the test
shows that no student from the ninth grade to the junior
college level has received the maximum score possible, nor is
there concentration of scores at the lower end of the range.
The distribution of scores is symmetrical with concentration
of scores at the mean, and, in general, the means tend to
increase with grade level.


Ability to Make Original Interpretations 

Although teachers are interested in appraising students’
ability to judge interpretations made by others, many teachers
wish also to measure the students’ ability to make their own
interpretations. In order to use scores on the interpretation
of data test as an index of the latter ability, there must be
evidence of a high correlation between scores on the test
and judgments of the students’ ability to make original
interpretations. To obtain such evidence, attempts were made
in earlier studies to validate the interpretation of data test by
using free essay responses of students as a criterion. For
example, in a study conducted in a large public junior high
school in which 193 students of seventh, eighth, and ninth
grades participated, the students were given [...] sets of data
taken from an Interpretation of Data test for the [...] high
school level (Form 2.71) and were asked to make free essay
responses following such general directions as: “[...]
statements that you are sure are true according to the facts
given in these data,” and “Write three statements based on
the data which you are not quite sure are true according to
these data.”

The objectivity secured in grading this essay form is indicated
by the values of the product-moment coefficients of
correlation among the three judges who marked the papers.
These values ranged from 0.92 to 0.96. Table 2 below gives
the values of the product-moment coefficient of correlation
between Form 2.71 and the essay form, and the reliabilities
of each form of the test.


TABLE 2

Statistics for General Accuracy Score of Test 2.71

        |    | Product-Moment     | Reliability Coefficient   | Reliability Coefficient
        |    | Correlation        | of Essay Form by Split-   | of Test 2.71 by
  Grade | N  | between Test 2.71  | Halves Method with        | Kuder-Richardson
        |    | and Essay Form     | Spearman-Brown Correction | Method
  ------+----+--------------------+---------------------------+------------------------
    7   | 68 |        0.69        |           0.88            |          0.80
    8   | 60 |        0.58        |           0.73            |          0.87
    9   | 65 |        0.44        |           0.79            |          0.91


The correlations between the two forms were positive and 
sufficiently large to warrant a further investigation of the re- 
lationship between the behaviors involved. 
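
The agreement among markers reported above rests on the product-moment (Pearson) coefficient computed for each pair of judges. The following is only a minimal sketch of that computation. The marks are hypothetical and are not taken from the study, and the names `pearson_r`, `marks`, and `pairwise` are introduced here purely for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical marks given by three judges to the same six essay papers.
marks = {
    "marker_a": [12, 15, 9, 20, 17, 11],
    "marker_b": [11, 16, 10, 19, 18, 10],
    "marker_c": [13, 15, 8, 20, 16, 12],
}

# One coefficient for each of the three pairs of judges.
pairwise = {
    (m1, m2): pearson_r(marks[m1], marks[m2])
    for m1, m2 in combinations(marks, 2)
}

for pair, r in pairwise.items():
    print(pair, round(r, 2))
```

With three judges there are three pairs, which is consistent with reporting a range of coefficients rather than a single value.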

Although a wide range of relationships, such as comparisons
and recognition of trends, was found in the statements
made by students, as a rule the free responses made by any
one student involved a narrow range of relationships, and
did not sample adequately his ability to make various types
of interpretations. In the next study, directions on the essay
form of the test were changed in an effort to encourage the
student to include a wider range of relationships in his
interpretations. The new directions posed a series of questions
designed to direct the attention of the student to the various
types of relationships found in the interpretations given in
Form 2.52. For example, after each of the following
interpretations is the question which corresponded to it in the
essay form:


1a. The ratio of agricultural production to the number
of farm workers increased every five years between
1900 and 1925. (Comparison of trends)

1b. In terms of these data alone, what do you believe
you can say concerning (a) the change in number
of farm workers employed compared to (b) the
change in volume of farm production throughout
the period recorded in the chart?

2a. The increase in agricultural production between
1910 and 1925 was due to more widespread use of
farm machinery. (Cause)

2b. In terms of these data alone, what do you believe
you can say about the cause [...]
volume of farm production between 1910 and 1925?

3a. The average amount of farm production was higher
in the period 1925 to 1930 than in the period 1920
to 1925. (Extrapolation)

3b. In terms of these data alone, what do you believe
you can say about the volume of farm production
during the period from 1925 to 1930?

This study was made with two populations of ninth, tenth,
eleventh, and twelfth grade students. One group consisted of
119 students from a large public high school and the other
was made up of 99 students from a smaller private high
school. The essay form was administered first, followed
within a week by the regular form of Form 2.52.

The essay responses were scored and summarized so that
statements involving each type of relationship could be
classified as accurate, beyond the data, cautious, involving a
crude error, or unable to see the relationship. In scoring, it
was possible by the use of a simple set of rules to score papers
so objectively that correlations of the scores given
independently by three markers ranged from .94 to .96. The
additional time required to answer the essay form made it
necessary to sample the types of relationships and the types of
data used in Form 2.52. Seven questions were formulated for
each of six of the ten exercises in Form 2.52; each of the 42
questions thus formulated corresponded in subject matter
and type of relationship to a statement used in that test. Only
39 answers were scored in the essay form because three
questions were later found to be ambiguous. These were a fair
sample of the whole test, since a product-moment correlation
coefficient of .85 (uncorrected for overlapping) was obtained
between the “general accuracy” score on these 39 items and
on the entire 150 items of Form 2.52. Since the correlation
between the part and the total test was desired as a measure
of the adequacy of the sampling, no correction for
overlapping was made. There was also a product-moment
correlation coefficient of .96 between the general accuracy scores of
the entire ten exercises of Form 2.52 and the six exercises
from which these 39 items were taken. However, there does
appear to be some difference in the difficulty of the 39 items
and of the total test. The mean general accuracy score for the
39 items was definitely higher than that for the total 150
items for each of the two different populations of
approximately 100 high school students. In spite of this difference,
however, the sample appeared to be sufficiently
representative for use in this validity study.

The scores on the essay form were correlated by the
product-moment method with scores on similar categories for
Form 2.52. The results are given in Table 3 below.

The reliabilities of the essay form for these populations
were computed by the Kuder-Richardson formula and are
found in Table 4. Reliabilities for Form 2.52 will be found in
Table 5 under the discussion of reliability.

Since the correlation coefficient is to be used as a measure

TABLE 3

                            General    Beyond                Crude      True-      Insufficient  Probably True-
Score                       Accuracy   Data       Caution    Error      False      Data          Probably False
Statistic              N    r    rc*   r    rc*   r    rc*   r    rc*   r    rc*   r    rc*      r    rc*
Small Private School   99   .72  .80   .60  .65   .50  .55   .22  .56   .37  .47   .64  [?]      .53  .63
Large Public School    119  .74  .83   .47  .52   [?]  .57   .08  .12   .58  .77   .58  .65      .55  .66

* rc refers to the coefficient corrected for attenuation due to the
unreliability of the criterion.

TABLE 4

Reliabilities by Kuder-Richardson Formula
for Two Populations on Essay Form

                        General   Beyond             Crude   True-   Insufficient   Probably True-
Score                   Accuracy  Data     Caution   Error   False   Data           Probably False
Small Private School    .81       .85      [?]       [?]     .61     .82            .70
Large Public School     .80       .82      .81       [?]     .57     .80            .70


of validity (that is, of the degree to which the ability to
make original interpretations of data can be predicted from
a score on Form 2.52), it does not seem legitimate to correct
for attenuation due to the unreliability of Form 2.52. The
relation between the theoretical ability to judge interpretations
and the theoretical ability to make original interpretations
is not at issue, but rather how well Form 2.52 predicts
the latter ability. Hence, it seems defensible to correct for
the unreliability of the criterion but not for that of Form
2.52. As seen in Table 3, such correction yielded validity
coefficients of .80 and .83 for the general accuracy score, and
lower values for the other categories. A validity coefficient of
.80 is sufficiently high for group predictions and is of some
value for study of individual students. Thus Form 2.52 can
be used as an index of the general accuracy with which a
group can make original interpretations of data. For the
populations used in this study, its validity as an index of the
types of errors into which students fall in making original
interpretations was not high.
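
The one-sided correction described above can be sketched as follows, assuming the usual correction for attenuation in which the observed correlation is divided by the square root of the criterion's reliability only. The function name and the dictionary layout are illustrative. The input figures are the general accuracy correlations from Table 3 and the essay-form reliabilities from Table 4.

```python
from math import sqrt


def correct_for_criterion_unreliability(r_xy, criterion_reliability):
    """Correct an observed correlation for unreliability of the criterion only.

    r_xy: observed correlation between the test and the criterion.
    criterion_reliability: reliability coefficient of the criterion.
    The reliability of the predicting test is deliberately left out.
    """
    return r_xy / sqrt(criterion_reliability)


# General accuracy: (observed r from Table 3, essay-form reliability from Table 4).
general_accuracy = {
    "small_private": (0.72, 0.81),
    "large_public": (0.74, 0.80),
}

corrected = {
    school: round(correct_for_criterion_unreliability(r, rel), 2)
    for school, (r, rel) in general_accuracy.items()
}
print(corrected)  # {'small_private': 0.8, 'large_public': 0.83}
```

Under this assumption the sketch reproduces the two corrected coefficients quoted in the text, .80 and .83.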

Some differences in the two forms of the test are apparent.
In the essay form the student could respond with more than
one statement or could make an irrelevant statement, that
is, a statement in which he failed to involve the relationship
to which the question was intended to direct his attention.
There was no opportunity in Form 2.52 to react in either of
these ways. However, since the relevant responses to each
question on the essay form were scored as a whole on the
basis of the main thought expressed, the number of extra
statements did not affect the score. The irrelevant statements
affected the score on general accuracy in the same way that
an omitted item would have affected this score on either form.
A study was made to determine whether the opportunity in
the essay form to respond with irrelevant statements might
be an important factor affecting the correlation between the
two instruments. The correlation coefficient between the
general accuracy score of the essay form and all of the
corresponding 39 items of Form 2.52 [...] .78. This seems to
indicate that the opportunity to make irrelevant responses on
the essay form may be one of the factors that limit the
correlation.

The comparison of patterns of responses for the same indi- 
viduals on the two test forms suggests another likely hy- 
pothesis to account for the differences in results. Many stu- 
dents apparently employed somewhat different standards in 
making original interpretations than they used when judging 
interpretations of data made by others. Students’ behavior in 
this respect may be classified into the following patterns: 


a. The student reacts similarly on corresponding items 
of the two forms. 

b. The student is overcautious on an item in judging in- 
terpretations made by others but goes beyond the 
data on the corresponding item in making his own 
interpretations. The reverse pattern also appears. 

c. The student is either very cautious or goes beyond 
the data in judging interpretations made by others, 
but is accurate when making his own interpretations. 
Here also the reverse pattern appears. 


Of these patterns, the first appeared most frequently, as might 
be expected from the high validity coefficients. Extreme dis- 
crepancies between reactions on corresponding items of the 
two tests (as described in pattern b) appeared very infre- 
quently. In pattern c, students tend to go beyond the data 
more in making their own interpretations of data than in 
judging interpretations made by others. 
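
The three patterns can be made concrete with a small tally. This is only an illustrative sketch: the category labels ("accurate", "cautious", "beyond") and the paired responses are hypothetical and are not the committee's actual scoring key.

```python
from collections import Counter


def classify_pattern(judging, original):
    """Return pattern 'a', 'b', or 'c' for one pair of corresponding items.

    judging:  reaction when judging an interpretation made by others.
    original: reaction when making his own interpretation.
    Both are one of "accurate", "cautious", "beyond" (hypothetical labels).
    """
    if judging == original:
        return "a"  # similar reaction on both forms
    if {judging, original} == {"cautious", "beyond"}:
        return "b"  # overcautious on one form, beyond the data on the other
    return "c"      # inaccurate on one form, accurate on the other


# Hypothetical reactions of one student on five corresponding items.
pairs = [
    ("accurate", "accurate"),
    ("cautious", "beyond"),
    ("beyond", "accurate"),
    ("accurate", "accurate"),
    ("accurate", "cautious"),
]

tally = Counter(classify_pattern(j, o) for j, o in pairs)
print(tally)
```

Because patterns b and c each include their reverse, the function compares the two reactions as an unordered pair before falling back to pattern c.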

While other factors may be present, the differences be- 
tween the essay form and Form 2.52 may in part be at- 
tributed to the opportunity in the essay form to make irrel- 
evant statements, and to the tendency of some students to 
use different standards in reacting to corresponding items of 
the two forms.

RELIABILITY OF THE INTERPRETATION OF DATA TESTS


The most comprehensive study of reliability of Form 2.52 
was made by the use of the Kuder-Richardson formula with 
19 populations from grades nine, ten, eleven, and twelve in 
seven schools. The reliabilities for the two populations used 
in the validity study were of special interest and are given in 
Table 5 below. The means and standard deviations for these 
two populations are listed in Table 6 below. 


TABLE 5

Reliabilities by Kuder-Richardson Formula
on Form 2.52 for Two Populations

                              General   Beyond             Crude   True-   Insufficient   Probably True-
Score                    N    Accuracy  Data     Caution   Error   False   Data           Probably False
Small Private School
  Grades 9, 10, 11, 12   99   0.93      0.91     0.91      0.75    0.78    0.92           0.88
Large Public School
  Grades 9, 10, 11, 12   119  0.95      0.95     0.87      0.81    0.84    0.90           0.88

It will be noted that the reliability coefficients in all cate- 
gories except crude error and true-false cluster around .90 
for both of these populations and that the general accuracy 
score has the highest reliability. The coefficients tend to 
form the same definite pattern from category to category for 
both populations, and the difference between the coefficients 
for the two populations on any single category is slight.

TABLE 6

Means and Standard Deviations of Per Cent Scores
on Form 2.52 for Two Populations

                             General      Beyond                    Crude        True-        Insufficient  Probably True-
Score                        Accuracy     Data         Caution      Error        False        Data          Probably False
Statistic               N    M     σ      M     σ      M     σ      M     σ      M     σ      M     σ       M     σ
Small Private School    99   56.3  10.9   19.6  11.2   36.1  13.5   [?]   5.3    78.3  15.0   76.8  16.7    24.3  14.1
Large Public School     119  45.9  13.7   47.6  13.8   [?]   10.3   13.2  7.0    62.0  17.3   41.3  17.5    34.1  16.1

When the means and standard deviations for the two samples
are considered, it will be noticed that the group from
the small private school is in general a superior group as
measured by Form 2.52. It is also a more cautious group as
measured by the high mean score on caution and by the low
mean score on accuracy with probably true-probably false.
Yet in spite of the difference in these two groups, the
reliabilities computed from them are very similar. Table 1 in the
Appendix gives the reliability coefficients for all nineteen
populations. It will be noted again that for these populations
the reliability coefficients of all scores except crude errors
and accuracy with true and false statements are sufficiently
high for group interpretation.
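
For readers unfamiliar with the Kuder-Richardson method, the sketch below shows one common form of the computation. It assumes that "case III" corresponds to the formula now usually called KR-20, and that each item is scored simply right (1) or wrong (0); neither assumption is confirmed by the report. The response matrix is hypothetical, and the function name `kr20` is introduced here for illustration.

```python
def kr20(item_scores):
    """Kuder-Richardson reliability (formula 20) for right/wrong items.

    item_scores: one list per student, each holding 0/1 scores per item.
    Uses the population variance of total scores.
    """
    n_students = len(item_scores)
    n_items = len(item_scores[0])

    # Variance of the students' total scores.
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_students
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses of five students to four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

reliability = kr20(responses)
print(round(reliability, 2))  # 0.8
```

The formula needs only item difficulties and the spread of total scores, which is why it can be applied to a single administration of a test without splitting it into halves.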

Before Form 2.52 was made, the split-half method was
used in deriving the reliability of Form 2.51. An effort was
made to split the test into “equivalent” halves by pairing
items according to definite criteria, such as the response
expected of the student, the types of interpretation involved,
the topic with which the data dealt, and the form of
presentation of the data. An analysis of the responses of 88 students
was used in an attempt to include in each half items which
presented these students with the same type of difficulty, but
it was not always possible to make an accurate match. The
correlation between “equivalent” halves of Form 2.51 was
computed from the scores of another population of 284
students in the three upper grades of two high schools. By
means of the Spearman-Brown formula it was possible to
predict the correlation for a test doubled in length. Table 7
contains these corrected correlations.
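
The Spearman-Brown step can be sketched as below. The half-test correlation of 0.85 is a hypothetical input, since the report gives only the corrected values; the function names are likewise introduced here for illustration.

```python
def spearman_brown(r_half, length_factor=2.0):
    """Predict reliability of a test lengthened by `length_factor`.

    r_half: correlation between the two "equivalent" halves.
    With length_factor = 2 this is the usual split-half correction.
    """
    return length_factor * r_half / (1 + (length_factor - 1) * r_half)


def half_from_full(r_full):
    """Invert the doubling formula: the half-test correlation implied
    by a given corrected (full-length) coefficient."""
    return r_full / (2 - r_full)


# Hypothetical correlation between halves.
print(round(spearman_brown(0.85), 2))  # 0.92

# Half-test correlation implied by a corrected coefficient of 0.92.
print(round(half_from_full(0.92), 2))  # 0.85
```

The second function only restates the same formula in reverse. It does not recover figures reported in the study.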

The coefficients obtained from the comparability study 
discussed previously may be considered another measure of 
reliability of the interpretation of data test and are also given 
in Table 7 below. However, the lower values of these coeffi- 
cients are attributable more to the difference between the 
two tests than to the unreliability of either of the tests. 


TABLE 7

Reliability Coefficients for Interpretation of Data Tests

                                                  General   Beyond             Crude   True-   Insufficient   Probably True-
Method              Population              N     Accuracy  Data     Caution   Error   False   Data           Probably False
Kuder-Richardson,   Grades 9, 10, 11, 12    119   0.95      0.93     0.87      0.81    0.84    0.90           0.88
  Form 2.52
Comparability,      Grades 10, 11, 12       337   0.85      0.81     0.85      0.65    0.74    0.83           0.84
  Forms 2.51-2.52
Split-halves,       Grades 10, 11, 12       284   0.92      0.91     0.91      0.82    0.86    0.92           0.87
  Form 2.51


When the reliabilities obtained by the three methods are
compared, it will be noted that the coefficients computed by
the Kuder-Richardson formula and by the split-halves
method are approximately the same and that, as would be
expected, the coefficients computed from scores on
“comparable” forms are smaller for all categories. These
reliabilities were considered rather high in view of the complexity of
the behaviors involved.

II. APPLICATION OF PRINCIPLES OF SCIENCE
ANALYSIS OF THE OBJECTIVE 


Teachers of science in schools of the Study believed that
students should learn to apply knowledge obtained in the
science classroom and laboratory to the solution of problems
as they arise in daily living. This aspect of critical thinking
was frequently mentioned in the list of objectives submitted
to the Evaluation Staff. A study of the prevailing curriculum
materials for science instruction confirmed the importance of
this objective, and therefore a committee was formed for the
purpose of clarifying it and of aiding in the development of
evaluation instruments for appraising growth in the ability
to apply science information. Although this objective had
previously been explored to some extent at the college level
by Tyler¹⁴ and others, and these explorations had served to
show that certain techniques for the measurement of the
objective were feasible, it could not be assumed that the
available analyses and methods were immediately applicable
at the secondary school level. This committee of teachers in
the schools therefore aided the Evaluation Staff in clarifying
the objective to be appraised and also in finding situations
which would give students an opportunity to show the
degree to which the objective had been attained. In the present
instance, clarifying the objective necessitated an analysis of
the behaviors involved in application and a selection of the
principles to be used.




Behaviors Involved in Application 

The analysis of the behaviors involved in application
separated the process of applying principles into two steps:
(1) the student studies a situation and makes a decision
about the probable explanation or prediction which is
applicable to this situation; (2) he justifies through the use of
science principles and sound reasoning the explanation or
prediction that he made in the first step. In the first step he
acts in the role of an authority who is presented with a
problem and asked for a solution. In the second step, he is
asked to explain or justify that proposed solution by means
of his previous knowledge of what has occurred in similar
situations.

14 Ralph W. Tyler, Constructing Achievement Tests, Bureau of
Educational Research, Ohio State University.

The kind of deductive thinking needed for the solution of 
these problems consists of the search for an explanation of 
the fact or facts described in the problem situation by means 
of some general rule which asserts a highly probable con- 
nection between facts of the kind described in the problem 
and other facts the student knows to be applicable to similar 
problems. The question he attempts to answer is: Does the 
general rule which is suggested by the given facts as an hy- 
pothesis for explaining what has happened (or what will 
happen) actually apply to this specific problem? The answer 
to this question comes, of course, from experimentation or 
direct observation. However, if observations have been made 
in several situations which can be shown to be similar to that 
one which is described in the test, then without obtaining 
the empirical evidence one may nevertheless predict with 
considerable confidence that the same conclusion is also true 
in this case. It was for the measurement of such behavior that 
the instruments to be described later were constructed. The 
teachers felt they needed the most help in evaluating the 
ability of students to apply principles in new situations, and
consequently the remembering of applications which had
been made was not included as a behavior to be directly
appraised.

Selection of the Principles


In the discussions that were held to clarify the meaning
of the term principle it was found that some teachers were
inclined to accept certain statements as representing
“principles” whereas others wanted to regard them as statements
of “facts.” The difficulty was resolved by obtaining an
agreement which permitted, for the purpose of testing application,
the use of any science information, fact, generalization,
understanding, concept, or “law” which proves to be useful
(alone or in connection with other information) for
predictive or explanatory purposes. Although more inclusive than
the definition of principle that is frequently used by science
teachers, this agreement seemed satisfactory for the
measurement of the objective as this committee conceived it. After
the committee had accepted this agreement as to the
“principles” which were to be used in the construction of the
instruments, teachers were asked to submit statements of those
principles which were considered important in their courses
and which had received the greatest emphasis in their
teaching. These lists included the principles with which their
students had had the greatest opportunity to become familiar
through reading, discussion, and experimentation.

The original lists from individual teachers included
principles from the fields of chemistry, physics, and biology, as
well as some that were common to all three fields. After the
principles submitted had been classified into subject-matter
areas, the complete list was sent to a number of teachers in
the Thirty Schools. These teachers were asked to:

1. Select those statements that they would expect their
students to apply in making predictions or explanations
in new situations.

2. Select those statements that they would expect their
students to know in a general way, but not to the extent
of being able to use them to make predictions in
new situations.

Only those principles which were included in the first
category by at least three-fourths of the teachers were
considered for use in the tests. Two additional criteria were
established to aid in the selection:


3. The principle should have a wide range of applica- 
bility to commonly occurring natural phenomena. 

4. The principle, with examples of its application to
commonly occurring phenomena, should be found in
all of the science textbooks commonly used in these
schools.


The teachers were also asked to judge the relevance of each 
principle to the areas of general science, biology, chemistry, 
or physics, or to all of these areas. 
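
The three-fourths rule lends itself to a short illustration. The sketch below is a hypothetical rendering of that screening step only: the principle names, the teachers' category choices, and the function `select_principles` are invented for the example, and the two additional criteria are not modeled.

```python
from fractions import Fraction


def select_principles(votes, threshold=Fraction(3, 4)):
    """Keep principles placed in the first category by enough teachers.

    votes: dict mapping a principle to the list of categories (1 or 2)
           assigned to it by the individual teachers.
    threshold: minimum share of teachers choosing category 1.
    """
    selected = []
    for principle, categories in votes.items():
        first_category = sum(1 for c in categories if c == 1)
        share = Fraction(first_category, len(categories))
        if share >= threshold:
            selected.append(principle)
    return selected


# Hypothetical judgments by four teachers on three principles.
votes = {
    "principle_A": [1, 1, 1, 2],  # 3 of 4 teachers: retained
    "principle_B": [1, 2, 2, 1],  # 2 of 4 teachers: dropped
    "principle_C": [1, 1, 1, 1],  # 4 of 4 teachers: retained
}

print(select_principles(votes))  # ['principle_A', 'principle_C']
```

Exact fractions are used so that a share of precisely three-fourths is retained, in keeping with the wording "at least three-fourths."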


THE DEVELOPMENT OF EVALUATION INSTRUMENTS 


During the period of the Eight-Year Study a number of
instruments were developed for evaluating the ability to
apply principles. Several of these instruments included
principles drawn from the subject-matter area of general science;
others were restricted to principles drawn from physics,
chemistry, or biology. Because the instruments which
included principles from general science were used more
extensively than the others and because they were the ones
experimented with in attempting to arrive at a satisfactory
pattern for the test, they will be used to illustrate the
construction of tests of application of principles.


Preliminary Investigations 


In preparing a test of Application of Principles, the first
step after the principles had been selected was to obtain
problem situations to which the student might react. Teachers
were asked to submit to the committee problem situations
which:

1. were new to the students (i.e., they were not ordinarily
discussed in the classroom or used in the textbooks);

2. occur rather commonly in actual life;

3. could be explained by the principles which the
teachers had selected as important for their students
to apply.


Attempts to phrase the problem situations revealed that they 
might be so described as to demand several different types 
of response from the student. Four types of response were 
used; namely, making a prediction, offering an explanation 
for an observed phenomenon, choosing a course of action, 
and criticizing a prediction or explanation made by others. 
An illustrative situation of each type follows: 


1. A farmer grafted a Jonathan apple twig on a small 
Baldwin apple tree from which he had first removed 
all the branches. The graft was successful. If a new 
branch develops from a bud below the point of the 
graft and produces apples, what kind of apple will it 
be? Here the student is asked to make a prediction 
about a situation in which presumably he has had no 
actual experience. It is presumed that if he under- 
stands certain laws of heredity, he will be able to 
make a valid prediction. 

2. All of the leaves of a growing green plant were ob- 
served to be facing in the same direction. Under what 
conditions of lighting was the plant probably grown? 
This example requires that the student offer an explanation
of an observed phenomenon. Some knowledge
of the principles of photosynthesis, growth, and
tropistic responses of plants would be required for the
solution of this problem.

3. The rear of an automobile on a wet pavement is skidding
toward a ditch. If you were the driver
of the car, what would you do to bring the car out of
the skid? This problem requires the student to choose
a course of action. A knowledge of the principles of
centrifugal force and Newton's laws of motion would
enable the student to choose a satisfactory course of
action.

4. It was reported in a newspaper that in order to tow 
down a river a huge oil drum filled with air, the work- 
men found it necessary to fill the drum with com- 
pressed air to increase its buoyancy. Do you believe 
that this would increase the buoyancy of the oil 
drum? This problem asks the student to criticize an 
explanation which has been given. Knowledge of the
fact that air has weight and of the principles of
buoyancy is required for a satisfactory solution of
this problem.


In none of these problems were the answers expected to 
be in exact quantitative terms; rather a qualitative under- 
standing of the general outcome was required. It was thought 
that the kind of activity shown by students in making a
prediction of this kind was of more importance for general
education than one which required exact substitutions of
numerical data in a formula or similar activities frequently 
used in the laboratory. One often encounters problems in 
which a principle is used to explain what happens in general 
when certain factors are varied in the situation, while the 
need for numerical solutions of problems occurs relatively 
infrequently for most people. Although the above problem 
situations are stated in such a way that the student is ex- 
pected to react somewhat differently in each, it is not likely 
that he will react intelligently to any of these situations un- 
less he has a knowledge of the principles operating and has 
recognized their application to the problem. Whether he 
criticizes a prediction made by someone else or makes the 
prediction himself, he must base his answer upon the knowl- 
edge which he feels is applicable to the situation. 

The next step in constructing the test was to determine the
reasons which might justify the response to the problem
situation, and to find a means of appraising the reasons cited 
by the student. Science teachers were in rather general agree- 
ment that the most valid of all the reasons a student might 
use for justifying his conclusions would be those that cited 
established scientific facts, principles, and generalizations. 
However, in addition to these, it was agreed that the student 
might cite from his experience, from authoritative materials 
he had read, or he might use analogous situations familiar to 
the person to whom he was explaining his decision, provided 
these experiences, authorities, or analogies were pertinent to 
the situation he was attempting to explain. 

In order to determine whether or not students did use 
these kinds of reasons, they were asked to write out both 
their own predictions, choice of action or responses to the 
situation, and all of the reasons that they believed would 
support the decision they had made. When these papers were 
analyzed by the teachers and the Evaluation Staff, the types 


of acceptable reasons which had been anticipated were 


found in the students’ responses. However, in addition to the 
acceptable, certain 


reasons which were agreed upon as being 

types of errors were also found to occur rather consistently 
in the written responses of the students. It was found that 
students frequently used teleological explanations and 
analogies not closely correspondent to the situation de- 
scribed in the problem. They cited authorities that were ques- 
tionable, ridiculed positions other than their own, stated as 
facts certain misconceptions or superstitions, merely restated 
either the facts given or their own prediction, and made less 
frequently a variety of other types of errors. They also used, 
in addition to the principles and facts judged to be accept- 
able and necessary to the explanation of the problem, other 
facts and principles that were irrelevant to the solution of 
the problem. The frequency with which each of these types
of reasons was used was not constant, but varied from class
to class and from problem to problem. In examinations of
sufficient length given to a large number of students, how- 
ever, these types of errors were found to be most prevalent. 

In general, it was possible to infer that the errors were 
made because: 


1. The student did not know the principles.

2. He did not see that a principle he knew applied to
the situation.

3. He knew the principle and knew that it applied to
the situation, but he was unable to explain adroitly
how or why it applied.

4. He used teleology, poor analogy, or poor authority,
rather than (or in addition to) correct facts and
principles.

5. Although his explanation was correct as far as it was
given, he cited facts and principles which were
inadequate for a convincing proof for a given selected
conclusion or course of action.

6. He confused closely related principles, only one of
which was applicable to the problem.

7. He used irrelevant material.

8. He neglected to study the description of the situation
carefully enough to note all of the limiting factors in
the description.


This list does not include all of the reasons why students
made errors, but it does help to show why it was difficult to
score the written responses.

Construction of Early Short-Answer Forms


The same problems of objectivity of scoring and of ade- 
quate sampling that are found in any essay-type test were 
inherent in these written responses. The teachers found that 
it was difficult to differentiate among those acceptable uses 
of generalizations, facts and principles which were relevant
to the problem, and the logical errors, obscured as they sometimes
were by illegibility of handwriting and by awkward
literary style. It was also difficult to decide when a student 
had cited enough evidence to support his choice of answer. 
A second criticism of this form of test was that it limited the 
number of principles which could be sampled because of 
the time required by the student to write out the answers. 


Because of these difficulties, a more objective means of testing
this same ability was sought.

Following a study of the responses written out by students,
the first of a series of objective test forms in this area was
made. The objective form of the test asked the student to
select from a list of predictions for each problem situation
the one which he thought was most likely to be true, and
then to select from a list of reasons those which would be
necessary to establish the validity of his choice. The predictions
and reasons used in the test paralleled those which
had been used frequently by the students when they wrote
essay-type responses. When experimental groups were given
an examination which required them to write out their predictions
and reasons for the first half of the testing period,
and an examination in which they were required to select
the correct prediction and the reasons which supported it
from a given list during the latter half of the period, it was
found that the results on the two types of examinations were
quite similar. The coefficient of correlation was in all cases
above 0.80.⁹ The advantages of more objective scoring and
the possibilities for more extensive sampling of problem
situations led to the adoption of the objective form.

⁹ Ralph W. Tyler, Constru[...], Educational Research Bulletin, XVI
(Jan. [...], 1937); Louis E. Raths, “Techniques of Test Construction,”
Educational Research Bulletin, XVII (April 13, 1938); Louis M. Heil,
“Evaluation of Student Achievement in the Physical Sciences—The
Application of Laws and Principles,” The American Physics Teacher,
VI (April, 1938).
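The comparison of the written and the objective forms rests on a coefficient of correlation between two sets of scores earned by the same students. As a minimal sketch of how such a coefficient might be computed, the following assumes one numeric score per student on each form; the function name and the six pairs of scores are hypothetical illustrations and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores on each form must vary")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for six students (illustration only).
essay_scores = [12, 15, 9, 20, 17, 11]
objective_scores = [14, 16, 10, 19, 18, 10]

r = pearson_r(essay_scores, objective_scores)
print(f"r = {r:.2f}")
```

A value near 1.0 would indicate that students who ranked high on the written form also tended to rank high on the objective form, which is the sense in which the two forms were judged to give similar results.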




The procedures used in preparing the early form of objective
tests in this area were as follows:

1. The principles to be used in the test were selected
in accordance with the criteria formulated by the
teachers interested in this objective.

2. Problem situations in which these selected principles
would apply were chosen with the following criteria
in mind:

2.1 They were to be new in the sense that they had
not been used in the classroom or laboratory.

2.2 The situation should approximate a rather commonly
occurring life situation.

2.3 The problem should be significant to students in
that its solution might help them to solve similar
problems which occur in their everyday living.

2.4 The vocabulary used should be at an appropriate
level for the students taking the test. They should
be able to understand the description of the
situation.

3. Several (usually three or more) plausible answers for
the problem were formulated. These might be in the
form of predictions, courses of action to be taken,
causes to be stated, or an evaluation of one of these
when it was given. Actually, when possible answers
were suggested by listing them in the test, the procedure
in every case would be one of evaluation through
the selection of what the student thought was the
most desirable, whether it was a prediction, course
of action or explanation for the phenomena which
had been described in the problem.

4. Finally, reasons of the sort used by students were
listed, including for each situation those common
types of errors which students made when they wrote
out their reasons. In addition to correct statements
of scientific principles needed for a satisfactory explanation,
the following types of statements were
formulated:

4.1 False statements purporting to be facts or principles.
These, if accepted as true, would support
one of the alternative conclusions. For example,
if the correct principle stated that a direct relationship
existed between two phenomena, one
might word a false statement in such a way as to
indicate that there was no relationship or that
the relationship was an inverse one. To remain
consistent in his reasoning, the student can use
such a statement only to support a conclusion
other than the acceptable one.

4.2 Irrelevant reasons. These statements are true, but
either they have no relationship to the phenomenon
described in the problem or they are quite
unnecessary in the explanation of the phenomenon.

4.3 False analogies. These stated directly or inferred
that the phenomenon described in the problem
was identical with, or very much like, some other
known phenomenon when it actually had little
or nothing in common with it; therefore, an explanation
for one phenomenon would not be
acceptable for explaining the other. Metaphors
were sometimes included as an example of a more
subtle use of analogy, in that the analogy was
implied by the use of words but not definitely
expressed.

4.4 Popular misconceptions. These included the more
common beliefs based upon unreliable evidence
or false assumptions. Frequently they were statements
of rather common practices based upon
accepted but unreliable evidence. Common
clichés or superstitions would also be included in
this type of statements.

4.5 The citing of unreliable authorities. Statements
introduced by phrases such as “Science says . . . ,”
or “People say . . . ,” or “It is reported in popular
magazines that . . .” were used. Here a distinction
must be made between such very general
or unreliable sources and those which might
be used with considerable assurance. However,
in any case the mere citation of authority did not
in any sense explain why a particular point of
view was correct; one would need in addition
to give the evidence used by this authority to
establish his position on the outcome of the
problem.

4.6 Ridicule. This rather common device of students
in their explanations suggested that any position
contrary to their own could only be held by someone
who did not know the facts.

4.7 Assuming the conclusion. These statements assumed
what was to be proved. This was most
frequently represented in these tests by essentially
repeating the conclusion by rewording it
without changing its meaning.

4.8 Teleology. These statements assume that plants,
animals, or inanimate objects are rational or
purposive.

An example of the wording of the directions for one of the
tests and a sample problem taken from the test follow.

Form 1.3

APPLICATION OF PRINCIPLES

Directions: In each of the following exercises a problem is given.
Below each problem are two lists of statements. The first list
contains statements which can be used to answer the problem. Place
a check mark (√) in the parentheses after the statement or
statements which answer the problem. The second list contains
statements which can be used to explain the right answers. Place
a check mark (√) in the parentheses after the statement or
statements which give the reasons for the right answers. Some of
the other statements are true but do not explain the right answers;
do not check these. In doing these exercises then, you are
to place a check mark (√) in the parentheses after the statements
which answer the problem and which give the reasons for
the RIGHT answers.


In warm weather people who do not have refrigerators sometimes
wrap a bottle of milk in a wet towel and place it where
there is a good circulation of air. Would a bottle of milk so
treated stay sweet as long as a similar bottle of milk without a
wet towel?

A bottle wrapped with the wet towel would stay sweet
a. longer than without the wet towel ( ) a.
b. not as long as without the wet towel ( ) b.
c. the same length of time—the wet
   towel would make no difference ( ) c.

Check the statements below which give the reason or reasons
for your explanation above.

Superstition           d. Thunderstorms hasten the souring of
                          milk. ( ) d.
Right Principle        e. The souring of milk is the result of
                          the growth and life processes of
                          bacteria. ( ) e.
Wrong                  f. Wrapping the bottle prevents bacteria
                          from getting into the milk. ( ) f.
Wrong                  g. A wet towel could not interfere with
                          the growth of bacteria in the milk. ( ) g.
Wrong                  h. Wrapping keeps out the air and hinders
                          bacterial growth. ( ) h.
Right Principle        i. Evaporation is accompanied by an
                          absorption of heat. ( ) i.
Authority              j. Milkmen often advise housewives to
                          wrap bottles in wet towels. ( ) j.
Unacceptable Analogy   k. Just as many foods are wrapped in
                          cellophane to keep in moisture, so is
                          milk kept sweet by wrapping a wet
                          towel around the bottle to keep the
                          moisture in. ( ) k.
Right Principle        l. Bacteria do not grow so rapidly
                          when temperatures are kept low. ( ) l.


In formulating statements for these earlier test forms, no
consistent pattern was followed. A study of the results obtained
by giving Form 1.3 to many science students suggested
the desirability of using in each of the testing situations
a pattern of reasons which would remain constant
throughout the test. It was believed that this would tend to
give a greater reliability to the subscores used in interpretation
and thus make the interpretations more meaningful. The
pattern of reasons to be included was determined through
discussions with teachers who had used Form 1.3. They were
asked to indicate the types of items in the test which seemed
to be most useful in diagnosing students' difficulties. Using
their suggestions, tests employing a pattern of responses
[...] through these steps: Situations [...] described for Form 1.3
but with greater emphasis upon [...]. [...] were worded in a way
that would require an explanation, prediction, choice of [...],
or an evaluation of [...] then formulated, [...]. [...] the test
were arrived at by first supporting the correct conclusion by
formulating three statements which support it and by implication
[...] two conclusions. Four wrong reasons
which, if accepted as true, would support the other conclusions
were next formulated. Two of these would tend to
support one of the wrong conclusions and two the other.
They would all tend by implication to eliminate the right
conclusion. One statement was formulated so as to be true
but irrelevant to the explanation of the problem. One each
of the following kinds of reasons completed the pattern—
a teleological statement, ridicule statement, assuming the
conclusion, unacceptable analogy, unacceptable authority,
and unacceptable common practice. Each of these was
worded to appear to be consistent with the conclusion keyed
as right. Tests following this general procedure were constructed
for the areas of chemistry (Form 1.31), physics
(Form 1.32), biology (Form 1.33), and general science
(Form 1.3a).¹⁰

¹⁰ A junior high school test, Form 1.3j, which uses a somewhat different
and less complex technique was also constructed.

A sample problem taken from Form 1.3a is given with the
directions and key.

PROBLEM

The water supply for a certain big city is obtained from a large
lake, and sewage is disposed of in a river flowing from the lake.
This river at one time flowed into the lake, but during the glacial
period its direction of flow was reversed. Occasionally, during
heavy rains in the spring, water from the river backs up into the
lake. What should be done to safeguard effectively and economically
the health of the people living in this city?

Directions: Choose the conclusion which you believe is most
consistent with the facts given above and most reasonable in the
light of whatever knowledge you may have, and mark the appropriate
space on the Answer Sheet under Problem —.

Conclusions:

√ A. During the spring season the amount of chemicals used
in purifying the water should be increased. (Supported
by 3, 7, 10, 12)

B. A permanent system of treating the sewage before it is
dumped into the river should be provided. (Consistent
with 5, 8, 12)

C. During the spring season water should be taken from the
lake at a point some distance from the origin of the
river. (Consistent with 12, 14)


Directions: Choose the reasons you would use to explain or sup- 
port your conclusion and fill in the appropriate spaces on your 
Answer Sheet. Be sure that your marks are in one column only— 
the same column in which you marked the conclusion. 


Reasons:

False Analogy          1. In the light of the fact that bacteria cannot
                          survive in salted meat, we may say that they
                          cannot survive in chlorinated water.

Irrelevant             2. Many bacteria in sewage are not harmful to
                          man.

Right Principle        3. Chlorination of water is one of the least
                          expensive methods of eliminating harmful
                          bacteria from a water supply.

Ridicule               4. An enlightened individual would know that
                          the best way to kill bacteria is to use chlorine.

Wrong                  5. A sewage treatment system is cheaper than
Supporting B              the use of chlorine.

Authority              6. Bacteriologists say that bacteria can be best
                          controlled with chlorine.

Right Principle        7. As the number of micro-organisms increases
                          in a given amount of water, the quantity of
                          chlorine necessary to kill the organisms must
                          be increased.

Wrong                  8. A sewage treatment system is the only means
Supporting B              known by which water can be made
                          absolutely safe.

Assuming               9. By increasing the amount of chlorine in the
Conclusion                water supply, the health of the people in this
                          city will be protected.

Right Principle       10. Harmful bacteria in water are killed when a
                          small amount of chlorine is placed in the
                          water.

Teleology             11. When bacteria come in contact with chlorine
                          they move out of the chlorinated area in
                          order to survive.

Right Principle       12. Untreated sewage contains vast numbers of
                          bacteria, many of which may cause disease
                          in man.

Practice              13. In most cities it is customary to use chlorine
                          to control harmful bacteria in the water
                          supply.

Wrong                 14. [word illegible] deposited in a lake tends to
Supporting C              remain in an area close to the point of entry.
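The key printed with this sample problem can be read as a lookup from reason number to type of reason, and the subscores used in interpretation are then counts of the reasons a student marked within each type. The following is a minimal sketch of that tally under stated assumptions: the category names are shortened forms of the printed key, and the function name and the student's marks are hypothetical. It is an illustration, not the scoring procedure actually used by the Evaluation Staff.

```python
from collections import Counter

# Key for the sample problem from Form 1.3a, taken from the labels and the
# "(Supported by ...)" notes printed with the problem.
REASON_KEY = {
    1: "false analogy",
    2: "irrelevant",
    3: "right principle",
    4: "ridicule",
    5: "wrong, supporting B",
    6: "authority",
    7: "right principle",
    8: "wrong, supporting B",
    9: "assuming conclusion",
    10: "right principle",
    11: "teleology",
    12: "right principle",
    13: "practice",
    14: "wrong, supporting C",
}


def tally_reasons(marked):
    """Count the reasons a student marked, grouped by keyed type."""
    unknown = sorted(n for n in set(marked) if n not in REASON_KEY)
    if unknown:
        raise ValueError(f"reason numbers not in key: {unknown}")
    # A reason marked twice on the answer sheet is counted once.
    return Counter(REASON_KEY[n] for n in set(marked))


# Hypothetical student who chose conclusion A and marked four reasons.
student_marks = [3, 6, 9, 12]
subscores = tally_reasons(student_marks)
for reason_type, count in sorted(subscores.items()):
    print(f"{reason_type}: {count}")
```

In this hypothetical case the student used two of the four right principles and also leaned on an authority statement and a restatement of the conclusion, which is the kind of pattern the subscores were meant to expose.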


An examination of the complete test would show that the
problem situations included in this form of the test deal with
personal health, public health, eugenics, conservation, and
the like, and many of them involve questions of opinion as
well as of the operation of science principles. The desirability
of using these types of problem situations was mentioned
by many of the science teachers who had used the earlier
form of the test; however, after such problems were formulated
it was discovered that very little agreement could be
[...] to the most defensible conclusion. [...] difficulty is
illustrated by the above problem [...] principles might be
cited in proposing a solution to the problem
of securing for this city a supply of water free from pathogenic
bacteria; but whether or not a supply of water free
from pathogenic bacteria constitutes an effective safeguard
of the health of the people and whether or not any proposed
method of securing such a supply of water will be
“economical” cannot be determined by science principles
alone. [...] choosing any one of the three conclusions given with
[...], it is necessary for the student to interpret the
terms effectively and economically. If the student regards
reasonable safety, such as might be secured by the administration
of additional chemicals to the water supply, as an
effective safeguard, and if he regards the use of chemicals 
as an economical practice, then he might defend conclusion 
A. However, another student might wish to defend conclu- 
sion B by pointing out that the use of chemicals assures only 
a reasonable safety under ordinary conditions and may fail 
under unusual circumstances, such as the sudden reversal of 
flow of the river, and that this practice cannot be considered 
economical in the long run when all the benefits of a sewage 
disposal system are considered. Still another student might 
defend conclusion C as representing a more effective safe- 
guard than that of A and a more economical practice than 
that of B. 

The difficulty of keying any of these responses by students 
as the correct one, unless one knows all of the evidence and 
values which the student would use to support his point of 
view, is obvious. Insofar as the student considers the prob- 
able effects of these practices upon the people living in the 
city, upon the people in nearby regions or in towns lying 
along the river, upon the future as well as the present citi- 
zens of this region, and upon the biological life in the waters 
of this region, he may interpret the terms effectively and 
economically so as to justify any of these three conclusions. 
The pertinent science principles can only aid a person in
predicting the effects of each of these practices; they cannot
determine whether or not these effects are to be desired.
Other students might wish to remain uncertain about which
conclusion to choose until further evidence had been obtained
about the problem. Such evidence might reveal that
it would be better to put into practice all three of the suggested
conclusions, i.e., purify the sewage by a permanent
system of treatment before it is dumped into the river, take
the water from the lake at a greater distance from the shore,
and finally add chlorine to the water before it is put into
the water mains. It should be clear from this discussion that
the effort to construct a test form which involved social
values as well as scientific principles led to situations which
were well suited for generating a desirable type of thinking,
but which at the same time created considerable technical
difficulty for the test constructors. In the discussion of the
next test in this series a method for solving these difficulties,
at least partially, will be discussed.


Structure of Form 1.3b 

In developing Form 1.3b two changes were made: (1)
the adoption of a different form of conclusion and the consequent
inclusion of reasons to be used if the student were
uncertain about the conclusion; (2) addition of acceptable
analogy and acceptable authority to the reasons to be used
to support or refute the conclusion. A keyed sample problem
from Form 1.3b is reprinted here to illustrate these
changes:


PROBLEM I 


A motorist driving a new car at night at the rate of 30 miles per
hour saw a warning sign beside the road indicating a “through
highway” intersection 200 feet ahead. He applied his brakes
when he was opposite the sign and brought his car to a stop 65
feet beyond the sign. Suppose this motorist had been traveling
at the rate of 60 miles per hour and had applied his brakes exactly
as he did before. He would have been unable to stop his
car before reaching the “through highway” intersection.

Directions:

A. If you are uncertain about the truth or falsity of the underlined
statement, place a mark in the box on the answer sheet
under A.

B. If you think that the underlined statement is quite likely to be
true, place a mark in the box on the answer sheet under B.

C. If you disagree with the underlined statement, place a mark
in the box on the answer sheet under C.

Directions for Reasons:

If you placed a mark under A, select from the first ten reasons
given below all those which help you to explain thoroughly why
you were uncertain and place a mark in Column A opposite each
of the reasons you decide to use.

If you placed a mark under B, select from reasons 11 through 24
all those which help you to explain thoroughly why you agreed
with the underlined statement and place a mark in Column B
opposite each of the reasons you decide to use.

If you placed a mark under C, select from reasons 11 through 24
all those which help you to explain thoroughly why you dis- 
agreed with the underlined statement and place a mark in Col- 
umn C opposite each of the reasons you decide to use. 


Reasons to be used if you are uncertain:

Lack of Experience      1. I have never driven an automobile at 60 miles
                           per hour and don't know how far an automobile
                           will travel after the brakes are applied.

Irrelevant “Control”    2. The distance required to bring a car to a stop
                           depends upon the condition of the road surface.

Irrelevant “Control”    3. The reaction time of the driver is an important
                           factor in determining the distance a car will
                           travel before it stops.

Irrelevant “Control”    4. The mechanical efficiency of the brakes will
                           affect the distance required for stopping a car.

Irrelevant “Control”    5. Whether the brakes are of the mechanical or
                           hydraulic type would make a difference in the
                           stopping distance.

Irrelevant “Control”    6. There are too many variable conditions in the
                           situation to enable one to be sure about the
                           stopping distance.

Lack of Knowledge       7. I do not know which mathematical formula to
                           apply in this problem.

Irrelevant “Control”    8. The distance required to bring a car to a stop
                           depends upon the mass of the car as well as the
                           speed.

Irrelevant “Control”    9. Whether he stopped the car or not before
                           entering [...] good a driver he was.

Irrelevant “Control”   10. The condition of the tires would be a factor to
                           consider in determining the stopping distance for
                           the automobile.


The description of this problem includes an underlined 
conclusion which the student is asked to judge. The student 
may agree, disagree, or be uncertain about the conclusion. 
In the earlier tests he had been asked to select from a list of 
conclusions the one he thought most appropriately answered 
the question asked in the description of the science situation.
The use of this form of the problem was adopted in
order to score the student on his ability to distinguish between
problems in which sufficient information was given to
enable him to be reasonably sure of his answer, and others
about which he should remain uncertain because necessary
information was not included in the description of the problem.
This form of the problem also enables the teacher to
discover those students who have become “over-critical,” i.e., 
who challenge problems by choosing the uncertain response 
when, in the judgment of the teachers, these problems are 
so stated that one can either agree or disagree with the 
conclusion. 

An investigation was undertaken to discover what effect 
the changed form of presenting the conclusion might have 
upon the results. It was found that it made little difference 
in which form the conclusion was given. Ninety-one students 
were given a test especially prepared for this investigation 
in which they were asked to select from a list of four con- 
clusions the one that they believed was most appropriate. 
This was followed in the same testing period by a second 
prepared test in which they were asked to make a judgment 
about a single conclusion. Two sample items are given here 
to illustrate how the problems were paired in the two tests. 


TEST I, PROBLEM I

A motorist had his tires filled to 35 pounds of pressure when the
temperature was 110° F. The temperature dropped to 80° the
next day. What probably happened to the pressure of the air in
the tires? (Assume that no air is lost from the tires.)

( ) A. The pressure would be greater than 35 pounds.

( ) B. The pressure would be less than 35 pounds.

( ) C. The pressure would not change.

( ) D. The pressure may be the same, greater, or less—
one cannot tell.

TEST II, PROBLEM I

A motorist on a trip to the West had his tires checked to 35
pounds on the edge of Death Valley Desert at about 4:00 P.M.
That night he stayed at a nearby tourists’ camp where the temperature
always dropped several degrees during the night. In
order to be sure that the old tires on his car would not blow out
during the night, he should let some of the air out of the tires.

( ) Agree     ( ) Disagree     ( ) Uncertain
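The principle behind Test I, Problem I is that the pressure of a fixed amount of air held at constant volume is proportional to its absolute temperature. The sketch below works the numbers under several assumptions that the problem does not state: the 35 pounds is taken as a gauge reading in pounds per square inch, atmospheric pressure is taken as 14.7 pounds per square inch, and the tire's volume is treated as unchanged. The function and constant names are hypothetical.

```python
ATMOSPHERIC_PSI = 14.7          # assumed atmospheric pressure, pounds per square inch
FAHRENHEIT_TO_RANKINE = 459.67  # offset from degrees Fahrenheit to absolute degrees Rankine


def gauge_pressure_after_change(gauge_psi, temp_f_before, temp_f_after):
    """Gauge pressure after a temperature change, for a fixed amount of air
    at constant volume (pressure proportional to absolute temperature)."""
    absolute_before = gauge_psi + ATMOSPHERIC_PSI
    t_before = temp_f_before + FAHRENHEIT_TO_RANKINE
    t_after = temp_f_after + FAHRENHEIT_TO_RANKINE
    absolute_after = absolute_before * t_after / t_before
    return absolute_after - ATMOSPHERIC_PSI


# Figures from Test I, Problem I: 35 pounds at 110 degrees F, then 80 degrees F.
new_pressure = gauge_pressure_after_change(35.0, 110.0, 80.0)
print(f"{new_pressure:.1f} pounds")
```

Under these assumptions the reading falls by roughly two and a half pounds, which agrees with response B; the same principle bears on Test II, Problem I, where the night temperature drops.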


Twenty-two such paired problems were included in the
two tests. A correlation between the number of right responses
made on the two tests was found to be .83. The two
tests were found to be about equally reliable (.53 and .55).
The mean of test I was slightly higher (10.91) than the
mean of test II (10.02), indicating that it was slightly less
[difficult]. [...] of the individual students to the
[...] two tests were found to be consistent
in 75 per cent of the cases. From this study it seems likely
that a score obtained from a test in which the student is
asked to select a conclusion for a stated problem will be a
good index of his score on a test in which he is asked to
judge a given conclusion. Because the student is required
to do less reading and consequently can react to more problems
in a given unit of time, the type of problem requiring
a judgment about a single conclusion was adopted.

The introduction of the “uncertain” response required a
new list of reasons to be included (reasons 1 to 10). These
ten reasons enable the student who chooses the uncertain
response to explain why he is unable to agree or disagree
with the conclusion. Most of these reasons are statements of 
additional factors which one might want to know before 
making a decisive judgment about the conclusion. They have 
been called “control” statements in the problems where un- 
certainty is considered the acceptable response to the con- 
clusion, and “irrelevant controls” in those problems where 
either agreement or disagreement with the underlined con- 
clusion is considered the acceptable response. It was also 
recognized that one might be unable to agree or disagree 
with the conclusion because of insufficient knowledge about 
the problem. To provide for this, statements which enable 
the student to say that he is unable to make a decision be- 
cause of lack of knowledge about, or experience with, this 
sort of situation are included in the first ten reasons. 

The student who chooses the uncertain response to the 
problem marks only those of the first ten reasons which he 
selects to explain his uncertainty and then proceeds to the 
next problem. The student who agrees or disagrees with the 
underlined conclusion disregards the first ten reasons and 
selects his supporting statements from reasons 11 to 24. The 
pattern of reasons included for supporting or refuting the 
conclusion is similar to that described for Test 1.3a, with 
two exceptions. These are the inclusion of an “acceptable” 
analogy and an “acceptable” authority statement in each 
problem. 


Continuation of PROBLEM I (p. 95) 


Reasons to be used if you agree or disagree:

Teleology               11. The increasing difficulty of stopping objects
                            at higher speeds is a part of nature's plan to
                            keep people from driving too fast.

Wrong Principle         12. The distance required to bring a car to a
                            stop is directly proportional to the speed of
                            the car. (Inconsistent with B)

Acceptable Practice     13. Many drivers have learned from experience
                            that the distance required to bring a car to a
                            stop is more than doubled when the speed is
                            doubled. (Inconsistent with C)

Unacceptable Analogy    14. Just as the centrifugal force acting on a car
                            going around a curve is increased four times
                            when the speed is doubled, so will the distance
                            required to stop a car be increased four
                            times when the speed is doubled.
                            (Inconsistent with C)

Right Principle         15. When brakes are applied with constant pressure
                            there is constant deceleration of the car.

Ridicule                16. Any student of physics ought to know that
                            the distance required to stop a car when it is
                            traveling at 60 miles per hour is more than
                            200 feet. (Inconsistent with C)

Assuming Conclusion     17. It would require more than 200 feet for the
                            motorist to bring his car to a stop traveling
                            60 m.p.h. (Inconsistent with C)

Wrong Principle         18. As the speed of a car increases, the mechanical
                            efficiency of the brakes decreases considerably.
                            (Inconsistent with B)

Right Principle         19. When the speed of a car is doubled, the distance
                            required to bring it to rest is increased
                            four times. (Inconsistent with C)

Unacceptable Authority  20. Automobile mechanics report that cars traveling
                            at 60 miles per hour cannot be brought
                            to a stop within 200 feet. (Inconsistent with
                            [C])

Right Principle         21. The distance moved while coming to rest by
                            an object undergoing constant deceleration
                            is proportional to the square of the velocity.
                            (Inconsistent with C)

Wrong Principle         22. When the velocity of a car is doubled, the
                            distance required to bring it to a stop may be
                            quickly calculated by multiplying the velocity
                            by four. (Inconsistent with C)

Right Principle         23. The kinetic energy of a car traveling at 60
                            miles per hour is four times that of the same
                            car traveling 30 miles an hour. (Inconsistent
                            with C)

Acceptable Analogy      24. Just as the penetrating distance of a bullet is
                            increased four times when its velocity is doubled,
                            so is the stopping distance of an automobile
                            increased four times when its speed is
                            doubled. (Inconsistent with C)


In the earlier forms of the test all analogy statements were 
formulated as unacceptable reasons. In this form two analogy 
statements are used in each problem, one acceptable as a 
reason for supporting the conclusion, the other unacceptable.
The inclusion of acceptable analogy statements makes
it possible to score a student on his ability to distinguish be- 
tween those statements of situations which are closely analo- 
gous to the original problem and those which seem to be 
but actually are not explainable by means of the same under- 
lying principles. The use of authority and practice had also 
been restricted in earlier test forms to the unacceptable use 
of such reasons. Because in life students are often forced
through exigencies of time and circumstance to use authority,
it was thought desirable to include in this test two such
statements in each problem, one of which was judged to
be acceptable and the other unacceptable. If students then
used such statements in justifying their reaction to the
conclusion, one would be able to distinguish those students who
used authorities discriminatingly from those who either did
not cite authorities or who were unable to distinguish between
authorities judged acceptable and those judged unacceptable.
The inclusion of these statements gives students
an opportunity to reveal whether or not they can distinguish
between authorities (either persons or institutions) which,
because of training, study, experience, etc., should be in a 
position to give reliable information about the problem, and 
those which involve the use of false credentials, or transfer 
of prestige from one field to another, and in reality offer little 
reliable evidence about the problem. 
Summarization and Interpretation of the
Scores on Form 1.3b 

The form of the data sheet on which the several scores 
are tabulated and summarized is presented on page 102. A 
description of how these scores are obtained from the test 
results and some of the possible interpretations is also given 
below. Some of the experimental procedures used for arriv- 
ing at this form of summary will also be described. 

An experimental form of Form 1.3b was given to 415 students
who were in the eleventh and twelfth grades of two
large public high schools (161 juniors and 254 seniors). The
results were studied in an attempt to discover a convenient
and meaningful method for reporting achievement. An item
analysis or record of the responses of students to each item
in the test was prepared. This was studied to reveal items
which seemed to need revision either because they were too
difficult, because they were ambiguous, or for some other
reason did not elicit the expected student response. A score
indicating the number of student responses on each separate
kind of item was then put on a tentative data sheet. Twenty-seven
scores were used for each student on this original data
sheet, and several others were computed from these in an
effort to find those which gave the most meaning to the
results.

The interrelationships of the scores were also studied.
From these preliminary studies the final form was made and
given to a new group of 283 students from two schools in
the Eight-Year Study. These students included 127 from the
tenth grade, 166 from the eleventh grade, and 40 from the
twelfth grade. These results were used for the statistical data
which will be found in Table 4 of Appendix II.

The final form for reporting scores determined by these
means contains 20 scores for each student. These 20 scores
seem to give all of the essential information necessary to
describe the differences in the students’ ability to apply
principles in the manner defined and measured by this test. An
examination of the data sheet (p. 102) will show how these
scores were finally recorded.

The scores made by seven students in the eleventh grade 
were selected for purposes of illustration. At the bottom of 
the sheet the maximum possible score, highest score, lowest 
score, and group median is recorded for each column. These 
were computed from the class from which these seven stu- 
dents were selected. Some of the scores represent actual 
number of responses, while others are computed in per cent 
by using certain of the scores from other columns as bases. 

The achievement of the student as revealed by the test 
may be analyzed in terms of five related questions. The first 
of these questions is: To what extent can the student reach 
valid conclusions involving the application of selected principles
of science, which he presumably knows, to new situations?


Columns¹ Column 1 gives the number of conclusions out of a
1, 2, 3 possible eight which the student marked correctly.
The eight correct responses were distributed among
agreement with the stated conclusion in three problems,
disagreement with the stated conclusion in three
problems, and uncertainty about the stated conclusion
in the remaining two problems. Column 2 (too
uncertain) gives the number of conclusions which
the student marked uncertain when the correct response
was either “agree” or “disagree.” Column 3
(too certain) gives the number of conclusions which
the student marked either agree or disagree when the
correct response was “uncertain.” When his scores in
columns 1, 2, and 3 do not total to eight, either the
student marked some conclusions agree which should
have been marked disagree, or he marked some conclusions
disagree which should have been marked
agree, or else he omitted some of the conclusions. If
we denote an interchange of the agree and disagree
responses by the term “error in fact,” the following
table may be used to describe the complete scoring of
the student’s conclusions.

                   |               Keyed response
Student's response | Agree         | Uncertain   | Disagree
-------------------|---------------|-------------|--------------
Agree              | Acceptable    | Too certain | Error in fact
Uncertain          | Too uncertain | Acceptable  | Too uncertain
Disagree           | Error in fact | Too certain | Acceptable

¹ The column numbers used in the following paragraphs refer to the
summary sheet (p. 102) on which the scores are recorded.
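
The classification in the table above can be sketched as a small lookup. This is only an illustration: the function names `classify` and `conclusion_scores`, the lower-case string labels, and the use of `None` for an omitted problem are assumptions introduced here and do not appear on the original data sheet.

```python
from collections import Counter


def classify(student: str, key: str) -> str:
    """Classify one marked conclusion against the keyed response."""
    if student == key:
        return "acceptable"
    if student == "uncertain":
        # Key was agree or disagree, but the student hedged.
        return "too uncertain"
    if key == "uncertain":
        # Key was uncertain, but the student committed himself.
        return "too certain"
    # Agree and disagree were interchanged.
    return "error in fact"


def conclusion_scores(responses, key):
    """Tally one student's conclusions.

    `responses` holds "agree", "uncertain", "disagree", or None (omitted).
    "acceptable", "too uncertain", and "too certain" correspond to
    columns 1, 2, and 3 of the summary sheet.
    """
    tally = Counter()
    for student, keyed in zip(responses, key):
        if student is None:
            tally["omitted"] += 1
        else:
            tally[classify(student, keyed)] += 1
    return tally


# Hypothetical key: three agree, three disagree, two uncertain.
key = ["agree"] * 3 + ["disagree"] * 3 + ["uncertain"] * 2
marked = ["agree", "agree", "uncertain", "disagree",
          "uncertain", "agree", "uncertain", "agree"]
tally = conclusion_scores(marked, key)
# Columns 1, 2, 3 come to 4, 2, 1 and total seven; the eighth
# response is an error in fact.
```

A paper whose first three tallies total less than eight contains either an error in fact or an omission, which is the situation the text describes for a score of seven.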


Thus on the sample data sheet student A marked all eight
of the conclusions in agreement with the key. Student D
agreed with the key four times and marked two of the conclusions
as uncertain when he should either have agreed or disagreed
with them according to the key. He also marked one
of the conclusions which was keyed as uncertain as agree
or disagree. Further he either made an “error in fact” by
marking an agree conclusion as disagree or a disagree conclusion
as agree, or he omitted one problem. This is shown
by the fact that his score on conclusions totals seven rather
than eight. One would have to examine his paper to determine
whether he had omitted a problem or made an “error
in fact,” for no score for problems omitted is recorded on
the data sheet.

The second question is: How does the student explain his
uncertainty when he marks the stated conclusion “uncertain”?


Columns Column 5 gives the number of statements which the
5, 15, 16 student used to express either a lack of knowledge
about, or experience with, the situation described in
the problem. These explain why he marked one or 
more of the stated conclusions “uncertain.” These 
statements are considered neither “right” nor “wrong” 
in scoring the test. Column 15 gives the number of 
statements which express a desire for control (see the 
test items themselves to clarify the intended meaning 
of "Control"). They also are used by the student to 
explain why he marked one or more of the stated 
conclusions “uncertain.” In two of the eight problems 
there is actually a need for further clarification or con- 
trol of certain factors involved in the problems. Col- 
umn 16 gives the number of statements, used by the 
student in these two uncertain problems, describing 
“controls” which are considered to be essential addi- 
tional information necessary for the solution of the 
problem, and hence are valid reasons for marking the 
conclusion uncertain. In the remaining six problems, 
the controls are considered to be unnecessary for the 
solution of the problem. The difference between the 
scores in columns 15 and 16 gives the number of un- 
necessary controls marked by the student. It should be 
borne in mind that a student has an opportunity to 
score in columns 5 and 15 when he marks a conclusion 
“uncertain,” but has an opportunity to score in column 
16 only when he marks the conclusion “uncertain” in
one of the two problems where the uncertain response
is regarded as the correct one.


Student D, as shown in column 5, used five statements
which expressed either a lack of knowledge about, or experience
with, those problems which he marked as uncertain.
Generally speaking, a high score in column 5 will be
associated with a low score in column 1. The correlation
between column 1 and column 5 is -.34. The fact that he
has a score of one in column 3 indicates that he marked one
of the problems which was keyed as uncertain in agreement
with the key, while the score of six in column 16 indicates
that he must have judged correctly the other uncertain problem.
His score of two in column 2 would account for the
seven unacceptable control statements which were used
(difference between columns 15 and 16), for in these two problems
he was attempting to justify an uncertainty through the
use of “control” statements when according to the key he
should have either agreed or disagreed with the conclusion.
In summary, student D marked four of the conclusions in
agreement with the key. He was too uncertain in two of the
problems and too certain in one. He either omitted one problem
or made an “error in fact” by marking an agree conclusion
disagree or a disagree conclusion as agree. He used five
statements to indicate that he did not understand some of
the problems where he was uncertain about the conclusion,
and thirteen statements of “controls,” six of which were
considered to be acceptable.

The third question is: To what extent can the student jus- 
tify logically his agreement with, his uncertainty about, or 
his disagreement with the stated conclusions? 


Columns Column 7 gives the total number of reasons used by 
7, 8,9, the student to explain his decisions about the stated 


7, 28 conclusions (excepting those which express a lack of 
knowledge about, or experience with, the situation 
described in the problem scored in column 5). Stu- 
dents vary a great deal in their comprehensiveness, 
that is, in the extent to which they use a large num- 
ber of reasons to explain their decisions about the 
stated conclusions. The meanings of every subscore 
on reasons for a chosen student must be interpreted 
in the light of the score which he received in column 
7. Column 8 gives the number of correct or acceptable 
reasons used by the student. Column 9 gives the per cent
accuracy of the student in supporting his decisions
about the stated conclusions with acceptable reasons. 
puted by dividing the score in column 12 by the score
in column 11 and expressing the result in per cent. 
The scores in column 16 were discussed above in con- 
nection with the second question. Column 19 gives the 
number of "sound" analogies used by the student. The 
difference between the scores in columns 19 and 18 
gives the number of unacceptable or false analogies 
selected by the student. Column 22 gives the number 
of acceptable appeals to authority or common practice 
which the student used in explaining his decisions 
about the stated conclusions. The difference between 
the scores in columns 22 and 21 gives the number of 
unacceptable appeals to authority or common practice 
selected by the student. 
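
The derived scores described above are simple ratios and differences of the recorded columns. The following is a minimal sketch under stated assumptions: the helper names `per_cent` and `unacceptable` are hypothetical, and rounding to the nearest whole per cent is assumed from the 96 per cent reported for student C rather than stated in the text.

```python
def per_cent(part: int, base: int):
    """Express `part` as a whole-number per cent of `base`.

    Returns None when the base is zero, since no per cent can be
    computed for a student who used no statements of that kind.
    """
    if base == 0:
        return None
    return round(100 * part / base)


def unacceptable(total_used: int, acceptable: int) -> int:
    """Number of unacceptable statements: total used minus acceptable."""
    return total_used - acceptable


# Student C: 23 of 24 principles were keyed as acceptable.
accuracy_c = per_cent(23, 24)      # 96

# Student D: 13 control statements (column 15), 6 acceptable (column 16).
extra_controls_d = unacceptable(13, 6)   # 7
```

The same subtraction gives the false analogies (columns 18 and 19) and the unacceptable appeals to authority or practice (columns 21 and 22). Because a per cent based on two items means something different from one based on twenty, the base is best reported alongside the per cent.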

Student C used a total of 34 reasons to justify the eight 
conclusions he selected. Twenty-four of these were restricted 
to principles, of which 23, or 96 per cent, were keyed as ac- 
ceptable. He also used five acceptable analogies, and only 
one statement which was classified as unacceptable because 
it was a ridicule, teleological, or assuming the conclusion 
type of reason. He did not use authority or common practice 
to explain his choice of conclusions. 

In making interpretations of a student's scores, all of his
scores on reasons should be judged in relation to his score in
column 7. Per cent scores should be judged in relation to the
number of items on which the per cent is based. That is, one
out of two may have quite a different meaning than 10 out
of 20. Reference to the “maximum possible,” the “lowest
score” and “highest score,” and the group median (all given
at the bottom of the summary sheet) will provide a frame of
reference for judging the student with respect to the members
of his own class.

Statistical data, including the reliability of each score, the
intercorrelations of various scores, means, and standard
deviations for several populations will be found in
Appendix II, Tables 4 and 5.
If students have been placed in situations in the classroom
and laboratory where resourcefulness, adaptability, and se- 
lective thinking have been essential for the solution of prob- 
lems, and if the emphasis given to teaching science prin- 
ciples has been upon their applications to the solution of 
problems involving commonly occurring natural phenomena 
rather than on the mastery of science information as an end 
in itself, then students should have little difficulty in behav- 
ing in the manner anticipated by this test. Such students 
would have had many opportunities to apply the principles 
of science as they learned them to a number of situations in 
the laboratory and classroom, and would have been encouraged
to be alert for similar opportunities for application as
they occur outside the classroom. 

Experience of teachers with this objective seems to indi- 
cate that the objective is not attained through any one par- 
ticular teaching unit. Rather it is the outcome of the way in 
which emphasis has been given to the objective with all the
science materials taught in the classroom and laboratory.
Consequently, teachers may wish to use from time to time
during the semester or year classroom exercises which can
be used for checking on these abilities and giving a tentative
appraisal of progress. A considerable number of such
exercises, much simpler in form than the tests of Application
of Principles, have been constructed by classroom teachers
in summer workshops.


III. APPLICATION OF PRINCIPLES OF LOGICAL REASONING


ANALYSIS OF THE OBJECTIVE 


The phrase “logical reasoning” is currently used to describe
a wide variety of behaviors. The whole process of
thinking about problems in an orderly scientific fashion is
sometimes called logical reasoning. In what follows the
phrase “logical thinking” will be restricted to mean
distinguishing between conclusions which follow logically from
given assumptions and conclusions which do not follow
logically from the given assumptions.

The intended meaning of the term “principles of logical 
reasoning” may be illustrated by means of the following ex- 
amples of such principles: 


A. Definitions: Crucial words and phrases must be precisely de- 
fined, and a changed definition may produce a changed con- 
clusion although the argument from each definition is logical. 

B. Indirect Argument: The validity of an indirect argument de- 
pends upon whether all of the possibilities have been con- 
sidered. If there are three and only three possibilities and 
one of them must happen, then if two of the possibilities are 
shown to be in fact impossible, the third must happen. The
conditions necessary for the logical use of indirect argument
are seldom fulfilled in practice.

C. Argumentum ad Hominem: An attack upon certain aspects of
a person or institution, even though justified, is not sufficient
to prove the lack of all merit in that person or institution.
This covers the common use of ridicule, attack on motives,
etc.

D. If-Then: If one accepts certain premises, then one must accept
the conclusions which follow from these premises. The
if-then principle is a necessary part of our method of criticizing
generalizations, questioning assumptions, etc.


The belief that the study of certain secondary school sub- 
jects develops a faculty for logical reasoning is no longer 
considered tenable. It is, however, quite different to claim 
that properly guided contact with the subject matter of the 
secondary school curriculum may provide experiences which 
will promote logical thinking in dealing with life situations.
Many secondary school teachers are endeavoring to have 
their students recognize patterns for logical thinking in the 
organization of certain bodies of subject matter. Sometimes 
the teachers make a conscious effort to have their students 
apply these patterns for thinking to problems which arise
in connection with their daily experiences. It is found that 
principles of logical thinking may be stated and applied to 
widely different kinds of situations. In the light of the
foregoing explanation, the objective under consideration may be
stated in general terms as follows: Students in secondary
schools should acquire the ability and the disposition to
apply principles of logical reasoning in dealing with their
everyday experiences.

Several more specific behaviors which might be chosen
to characterize progress toward the achievement of the
objective are listed below:

a. Disposition to examine the logical structure of the arguments
and to apply principles of logical reasoning in the
study of these arguments.

b. Ability to distinguish between conclusions which do and
ones which do not follow logically from a given set of
assumptions.

c. Ability to isolate the significant elements in the logical
structure of an argument as shown by distinguishing between
statements of ideas which are relevant and statements
of ideas which are irrelevant for explaining why a
conclusion follows logically from given assumptions.

d. Ability to recognize the application of a logical principle,
whether stated in general terms or specifically referred to
the situation in question, to explain why a conclusion follows
logically from given assumptions.

No effort to prepare objective tests to measure the disposition
of students to apply logical principles in dealing with
their everyday experiences was made by the Evaluation
Staff. A test devised for this purpose would present serious 
problems of validation. The difficulties attendant upon the 
construction and administration of such a test would probably
be greater than the difficulties of observing the students
directly. Hence the efforts to measure behaviors related
to the objective have been directed toward measuring
the abilities connected with applying logical principles 
rather than toward measuring the disposition to apply log- 
ical principles. 

The following discussion deals with the evaluation of the 
ability to judge the logical structure of arguments presented 
in written form. This ability will have much in common with 
the ability to judge the logical structure of arguments pre- 
sented verbally, pictorially, or otherwise. Some students will 
have occasion in later life to write essays, prepare speeches, 
and the like. For these students an emphasis upon the pro- 
ducer aspect of applying logical principles is easily justified. 
Almost all students, however, will read editorials and adver- 
tisements, listen to political speeches, and the like. Hence 
this consumer aspect of applying logical principles (for ex- 
ample, taking note of the need for definition of terms) may 
be considered an objective of general education. 


THE DEVELOPMENT OF EVALUATION INSTRUMENTS 

Preliminary investigations 

The first step toward the construction of a test for this ob- 
jective was the preparation of a list of logical principles 
which secondary school students might be expected to apply. 
A few principles were found explicitly stated in secondary 
school textbooks (particularly of geometry) and the list was 
extended by reference to books on logic. From this list the 
four stated above were selected. Teachers of mathematics 
were particularly concerned with the objective, and their in- 
terests largely determined the choice which was made. The 
principles stated relative to definitions, indirect (or reduc- 
tio ad absurdum) argument, and “if-then” reasoning play an 
important role in the teaching of geometry. The fallacy of
argumentum ad hominem was included because the claim
has so frequently been made that the study of geometry,
which as usually taught offers little opportunity for this sort
of error, provides a standard of comparison for reasoning in
other situations. Consequently if the acquaintance with this
standard is functional, it should enable the student to
recognize the fallacy.

The second step toward the construction of a test consisted 
of a search of current newspapers, magazines, and legal case- 
books for suitable reasoning situations. These sources were 
chosen because of the emphasis being given in several of the 
schools upon reasoning in life situations. The legal cases 
which formed the basis of several test problems were typical 
of those reported almost daily by the press, but were be- 
lieved to be of greater interest to students. 


Construction of Early Short-Answer Forms 

The first test which was constructed (Form 5.1) described
12 different reasoning situations or problems,¹² each followed
by several possible conclusions. The student was asked to
select one of the conclusions and to defend it by selecting
reasons from a list which followed. Each logical principle
could be correctly used to defend a conclusion in three
different problems. Included in each list of reasons were
statements of several of the principles listed above, and additional
statements which were irrelevant or otherwise unsatisfactory
as reasons. The occurrence of several of the principles in
each list of reasons required the student to discriminate
among them even if the relatively abstract form of statement
helped him to identify them.

In order to discover what sort of statements other than
principles should be included among the reasons, a form was
prepared which contained only the situations and the several
alternative conclusions. Four classes of tenth and eleventh
grade students took this test and wrote out their reasons
in essay form. Many of the reasons ultimately used in
the short-answer form were taken with practically no changes
from student papers. This preliminary investigation also
served to suggest revisions in the statement of the situations
and the conclusions.

¹² For a similar problem taken from a later form, see p. 119.

The scoring plan finally adopted for Form 5.1 allowed two 
points for each correct conclusion, one point for each correct 
reason, and deducted one point for each incorrect reason. 
A score was given indicating achievement relative to each 
principle separately, and also a total score. 
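
The scoring plan for Form 5.1 amounts to a weighted sum. A minimal sketch follows; the function name `form_5_1_score` and the sample counts are hypothetical, and only the weights (two, one, and minus one) come from the text.

```python
def form_5_1_score(correct_conclusions: int,
                   correct_reasons: int,
                   incorrect_reasons: int) -> int:
    """Form 5.1 score: 2 points per correct conclusion, 1 point per
    correct reason, and 1 point deducted per incorrect reason."""
    return 2 * correct_conclusions + correct_reasons - incorrect_reasons


# Hypothetical student: 10 correct conclusions, 14 correct reasons,
# 3 incorrect reasons.
total = form_5_1_score(10, 14, 3)   # 2*10 + 14 - 3 = 31
```

Applying the same function to the three problems built on a single principle would give the separate score for that principle.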

The next form (5.11) of the Application of Principles of 
Logical Reasoning test incorporated several changes. It was 
noted that the statements of logical principles in Form 5.1 
were of two kinds. Some of the statements referred directly 
to the situation under consideration and others were general
statements of logical principles. A pattern of statements was 
built into Form 5.11 with a view to securing separate scores 
on ability to recognize the application of principles which 
were stated specifically and principles which were stated 
generally. In each problem there were four specific state- 
ments of principles, four general statements of principles, 
and two extraneous statements including in the test as a 
whole statements of personal opinion, prejudice, reliance 
upon authority, and the like. Of the four specific and four 
general statements in each problem, one of the specific and 
one of the general statements were relevant in the sense that 
they explained why the correct conclusion followed logically 
from the given assumptions. In a sense the cards were stacked 
against the student by providing three opportunities to use 
an irrelevant statement of a principle and one opportunity to 
use an extraneous statement for each opportunity to use à 
relevant statement of a principle. 

The four principles (definition of terms, indirect argument,
argumentum ad hominem, and if-then) tested in Form
5.1 were again tested in Form 5.11. In addition, a principle
relative to sampling (“A sample does not necessarily represent
the population from which it was drawn”) was included
in Form 5.11. Three problems on each principle were given,
or 15 problems in all.

When the test results were summarized, an attempt was 
made to score the number of correct conclusions (out of a 
possible three) on each principle and the number of correct 
(out of a possible six) and incorrect (out of a possible 
eighteen) uses of statements of each of the given principles. 
These scores were found to be too unreliable to be useful 
in practice. Moreover, the attempt to summarize separately 
the right and wrong uses of specific and of general statements
of principles did not yield results of practical significance.
It was found that the scores on specific statements
were highly correlated with the scores on general statements. 

In the final analysis the scoring of Form 5.11 yielded six
useful scores. These were scores on numbers of right and
wrong conclusions, right and wrong total reasons, extraneous
reasons, and general accuracy. The general accuracy
score was computed as twice the total number of right responses
(conclusions and reasons) minus the total number
of wrong responses (conclusions and reasons). This score
was highly correlated with each of the other scores, and a
reliability coefficient of .94 was obtained for a population of
216 students.
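
The general accuracy score for Form 5.11 can be written out directly from its definition. In this sketch the function name `general_accuracy` and the sample counts are assumptions made for illustration; the formula itself (twice the right responses minus the wrong responses) is the one stated in the text.

```python
def general_accuracy(right_conclusions: int, right_reasons: int,
                     wrong_conclusions: int, wrong_reasons: int) -> int:
    """Twice the total right responses minus the total wrong responses,
    where responses include both conclusions and reasons."""
    total_right = right_conclusions + right_reasons
    total_wrong = wrong_conclusions + wrong_reasons
    return 2 * total_right - total_wrong


# Hypothetical student on the 15 problems of Form 5.11.
score = general_accuracy(right_conclusions=12, right_reasons=20,
                         wrong_conclusions=3, wrong_reasons=7)
# 2 * (12 + 20) - (3 + 7) = 54
```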

A consideration of the desirable improvements to be made
in revising this test led to several suggestions. Form 5.11 was
a long test and was made inefficient by the large proportion
of wrong statements. The student who responded correctly
to the test problems made an explicit response to only one
statement in five. The assumption that by refraining from
marking a statement a student was making an explicit response
(e.g., “the statement is irrelevant”) was not thought
to be tenable. Thus the student who refrained from marking
a statement might have done so because he did not understand
the statement or did not take time to consider it fully.
Hence a tentative revision of Form 5.11 was made and given
to 60 students. In this form, 5.11a, the students were asked 
to respond to every statement and to decide whether it was 
(1) specific and relevant, (2) specific and irrelevant, (3) 
general and relevant, (4) general and irrelevant. This at- 
tempt to get at possible differences in the ability of students 
to deal with specific and general statements of logical prin- 
ciples was again not successful. No very meaningful inter- 
pretations of difference between the ability to deal with 
specific and the ability to deal with general statements could 
be made. However, when scored in terms of relevance alone, 
for example, total number of irrelevant statements classified 
under (2) or (4) above, Form 5.11a yielded very promising 
results. With only eight problems based on four principles, 
it was possible to secure a number of diagnostic scores in- 
cluding scores on each of the principles separately. For this 
latter purpose the method formerly used for scoring the 
separate principles on Form 5.11 was changed. Rather than 
counting the number of correct and incorrect uses of each 
principle throughout the test, the plan was now adopted of 
scoring two intact problems both directed at the definition
principle to secure a score on accuracy with definition, and
similarly with the other principles. This plan was later used
in summarizing the results on the final test, Form 5.12. The
scoring of this test will be discussed in some detail in what 
follows. 
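
Scoring Form 5.11a "in terms of relevance alone" means collapsing the four response categories into two. The sketch below is an illustration only: the names `is_irrelevant` and `irrelevant_correctly_marked`, and the representation of the key as a list of booleans, are assumptions and not part of the original scoring procedure.

```python
# Response categories on Form 5.11a, as numbered in the text:
# (1) specific and relevant     (2) specific and irrelevant
# (3) general and relevant      (4) general and irrelevant
IRRELEVANT_CATEGORIES = {2, 4}


def is_irrelevant(category: int) -> bool:
    """Collapse the four-way response to relevance alone."""
    return category in IRRELEVANT_CATEGORIES


def irrelevant_correctly_marked(responses, keyed_relevant):
    """Count statements keyed as irrelevant that the student placed
    under (2) or (4), ignoring the specific/general distinction."""
    return sum(
        1
        for category, relevant in zip(responses, keyed_relevant)
        if not relevant and is_irrelevant(category)
    )


# Hypothetical responses to five statements and their keyed relevance.
responses = [1, 2, 4, 3, 2]
keyed_relevant = [True, False, False, True, True]
count = irrelevant_correctly_marked(responses, keyed_relevant)   # 2
```

Restricting `responses` and `keyed_relevant` to the two problems aimed at one principle would yield a separate score for that principle, in the manner the text describes.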
Structure of the Application of
Principles of Logical Reasoning Test, Form 5.12 

It has been found that Form 5.12 of the Logical Reason- 
ing test provides a better analysis of the students’ abilities 
in relation to the objective than did previous forms. Moreover,
with the exception of the original Form 5.1, this form
is considerably shorter than previous forms and somewhat
simpler from the standpoint of the directions to the student.
A study of the following explanation of the structure of the
test problems in comparison with the sample test problem 
presented below will serve to clarify the objective further 
and to indicate the extent to which it is measured by the 
test. A list of the responses accepted as correct by a jury of 
competent persons (i.e., a test key) is given in the margin. 


Problem IV 

In January, 1940, Commissioner K. M. Landis submitted a plan 
to give financial aid to minor league baseball teams to restore 
fair competition by preventing certain major league teams from 
controlling the supply of players. Several leaders in the baseball 
world objected to this plan; some declared that Landis should 
enforce the rules governing the operation of baseball teams, but 
should not make interpretations which would change the in- 
tended meaning of the rules set up by the proper committees. 


Larry MacPhail, president of the Brooklyn Dodgers, speaking at
a dinner in Boston, expressed grave concern over the situation.
The following statements are quoted from his remarks: “In the
matter of Landis versus the present system, he sits as prosecutor,
judge, and jury, and there is no appeal. If baseball is to be
dominated by any selfish group, it won't be long before professional
football or some other sport will replace baseball as the great
national game, and none of us want that.”


Directions. Examine the conclusions given below. If by “us” Mr. 
MacPhail means all persons at the dinner, and if they accept his 
remarks as true, which one of the conclusions do you think is 
justified? 

Conclusions 

A. Logical persons at the dinner will conclude that they do 
   not want baseball to be dominated by a selfish group. 

B. Logical persons at the dinner will conclude that, if the 
   domination of baseball by a selfish group is prevented, 
   baseball will not be replaced as the great national game. 

C. It is impossible to say what a logical person at the dinner 
   will conclude. 

Mark in column A: Statements which explain why your conclusion is logical. 
Mark in column B: Statements which do not explain why your conclusion is logical. 
Mark in column C: Statements about which you are unable to decide. 

Statements 

A   1. Since we assumed that Mr. MacPhail referred to all persons 
       present at the dinner when he said “none of us,” and 
       that those present accepted his statements as true, the 
       conclusion which we reached follows logically. 

B   2. Logical persons at the dinner may agree or disagree with 
       Mr. MacPhail. 

B   3. Without knowing the assumptions of logical persons, we 
       cannot predict their conclusions. 

A   4. If no person at the dinner wants professional football or 
       some other game to replace baseball as the great national 
       game, then the logical ones cannot want baseball to be 
       dominated by a selfish group. 

A   5. If we accept the assumptions on which an argument is 
       based, then, to be logical, we must accept the conclusions 
       which follow from them. 

B   6. Sometimes the meaning of a word or phrase used in an 
       argument must be carefully defined before any logical 
       conclusion can be reached. 

B   7. A changed definition may lead to a changed conclusion 
       even though the argument from each definition is logical. 

B   8. If the domination of baseball by a selfish group results 
       in some other sport replacing baseball, then, if such 
       selfish domination is prevented, baseball will not be replaced. 

B   9. Mr. MacPhail considered every possibility—either baseball 
       will or will not be replaced as the great national 
       game—and thus made a sound indirect argument. 

A  10. If a conclusion follows logically from certain assumptions, 
       then one must accept the conclusion or reject the 
       assumptions. 

B  11. If one removes the fundamental cause for other games 
       replacing baseball, baseball will not be replaced as the 
       great national game. 

B  12. The soundness of an indirect argument depends upon 
       whether all of the possibilities have been considered. 
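
Statement 8 of the sample problem is keyed B. One way to see why is to treat the argument as a simplified propositional form: from "if baseball is dominated, it will be replaced" the statement infers "if it is not dominated, it will not be replaced." The short truth-table sketch below is an illustration only, not part of the test; the names `d` (dominated) and `r` (replaced) and the helper functions are assumptions introduced here.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def entails(premise, conclusion):
    """True when the conclusion holds in every case where the premise holds."""
    return all(
        conclusion(d, r)
        for d, r in product([True, False], repeat=2)
        if premise(d, r)
    )


# d: baseball is dominated by a selfish group
# r: baseball is replaced as the great national game
premise = lambda d, r: implies(d, r)

# Inference made by statement 8: "if not dominated, then not replaced".
inverse = lambda d, r: implies(not d, not r)

# Contrapositive: "if not replaced, then not dominated".
contrapositive = lambda d, r: implies(not r, not d)

inverse_follows = entails(premise, inverse)
contrapositive_follows = entails(premise, contrapositive)

print("statement 8 inference follows:", inverse_follows)
print("contrapositive follows:", contrapositive_follows)
```

Under this reading the inference in statement 8 fails in the case where baseball is replaced for some other reason, while the contrapositive form always follows from the premise.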


In each problem the student is given a paragraph, three 
conclusions, and twelve statements. He is directed to read 
the paragraph carefully and to choose the one of the three 
conclusions which he thinks is justified by the paragraph. 
In the test as a whole the student judges the logical appropriateness 
of the conclusions drawn in eight different 
situations. In two of these the definition principle operates; 
in two others the indirect argument principle operates; in 
two others the argumentum ad hominem principle operates; 
and in the remaining two the if-then principle operates. It 
should be noted that the number of possible correct conclusions 
is small, especially if considered with respect to the 
opportunity to use the correct principles separately. Consequently 
the major emphasis is placed upon the students' reactions 
to the statements which follow the conclusions in 
each test problem. 


The statements offered to the students are of several kinds, 
including: 

a. General or abstract statements of the logical principle 
   involved in that particular test situation. 
b. Specific statements of the logical principle involved 
   in the particular test situation. 
c. General or specific statements of logical principles 
   not pertinent to the particular test situation, statements 
   which appeal to authority, prejudice, or personal 
   opinion, or statements which are otherwise 
   irrelevant. 

The student is directed to mark each statement in one of 
three ways according as it is: 

a. Relevant for explaining why his conclusion is logical. 
b. Irrelevant for explaining why his conclusion is logical. 
c. Not sufficiently meaningful to him to permit a decision. 


In the test as a whole the student judges the relevance of 
96 statements, and is given the opportunity to reveal his lack 
of understanding of any of these statements. The variety of 
the statements including specific and general statements of 
the principles, statements of authority, personal opinion, 
prejudice and the like provides an opportunity to make 
many of the common logical errors. The sample of state- 
ments in the test includes 36 relevant and 60 irrelevant state- 
ments. Of the 36 relevant statements, 16 are specific and 20 
are general. Of the 60 irrelevant statements, 20 are general 
statements of the four principles of the test, 19 are specific 
statements of these principles, and 21 are specific statements 
of the other kinds mentioned above. 
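
The composition of the statement sample can be checked by simple arithmetic. The sketch below only restates the counts given in the text; the dictionary keys are labels chosen here for illustration and do not come from the test manual.

```python
# Counts as reported in the text for Form 5.12.
problems = 8                  # eight test situations
statements_per_problem = 12   # twelve statements follow each situation

relevant = {"specific": 16, "general": 20}
irrelevant = {
    "general_statements_of_the_four_principles": 20,
    "specific_statements_of_the_four_principles": 19,
    "other_specific_statements": 21,  # authority, opinion, prejudice, etc.
}

total_statements = problems * statements_per_problem
total_relevant = sum(relevant.values())
total_irrelevant = sum(irrelevant.values())

print(total_statements, total_relevant, total_irrelevant)
```

The three totals agree with the figures in the paragraph: 96 statements in all, of which 36 are relevant and 60 irrelevant.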


Summarization and Interpretation of the Scores on Form 5.12 
During the experimental stages of Form 5.12, the test re- 
sults for a sample population of 351 students were studied 
intensively in an attempt to discover the most convenient 
and most meaningful form for reporting the results. An item 
analysis or record of the responses of all students to each 
item on the test was prepared. The individual student papers 
were scored by entering the number of responses of each 
separate kind on a tentative data sheet. Fourteen scores were 
summarized for each student, and more than eight additional 
scores were considered during the study. Certain important 
scores were selected and studied with reference to the item 
analysis in an effort to see more clearly the relationships between 
each of these scores and the responses of students to 
individual test items. 


The 351 students comprised 12 separate classes in four 
public schools. Certain facts about the backgrounds of these 
different classes were known. The responses of each class to 
the individual test items (taken from the item analysis), and 
the median scores of each class (taken from the data sheets), 
were studied in an attempt to discover the degree of agree- 
ment or disagreement of these results with the known facts 
about the various classes of students. The results of this study 
indicated that the students who secured good total scores 
were also the students who did well with the individual test 
items. Moreover, it was found that the classes which had had 
most contact in school with the logical reasoning objective 
tended to secure the highest scores on the test. 

Certain correlation coefficients between the scores which 
had been summarized were computed. It was found possible 
to reduce the number of scores on the data sheet to 11 without 
an appreciable loss of information. It was again found 
that separating the responses to specific statements of principles 
from the responses to general statements of principles 
did not yield results of practical significance. Several attempts 
were made to secure a general accuracy score which 
would serve as a good over-all index of behaviors involved 
in the application of principles of logical reasoning. For example, 
the total number of correct responses to statements 
on the test, and twice the number of relevant statements 
recognized as such, less the number of irrelevant statements 
judged to be relevant, were tried. It was found that all of 
these indices were highly correlated with one or more of 
the simpler scores obtained by counting the numbers of responses 
of a certain kind, and that the indices were no more 
reliable than the simpler scores. Hence no score in general 
accuracy was retained. Because the number of irrelevant 
statements on the test is larger than the number of relevant 
statements (60 as compared with 36), the score on irrelevant 
statements recognized as such is more reliable than the 
score on relevant statements recognized as such (.88 as compared 
with .72). The correlation studies indicated that if a 
single index for the abilities measured by this test is desired, 
the score on the number of irrelevant statements judged to 
be irrelevant is perhaps the best such index among the 11 
scores summarized on the data sheet which was finally 
adopted.19 
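
The two trial indices named above can be illustrated on a single problem. The sketch below uses the marginal key of sample Problem IV; the student responses are hypothetical, and the function and field names are assumptions made for this illustration rather than the Evaluation Staff's actual data sheet.

```python
# Key for Problem IV as given in the margin: A = relevant, B = irrelevant.
KEY = {1: "A", 2: "B", 3: "B", 4: "A", 5: "A", 6: "B",
       7: "B", 8: "B", 9: "B", 10: "A", 11: "B", 12: "B"}

# Hypothetical responses of one student (C = unable to decide).
RESPONSES = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B", 6: "B",
             7: "B", 8: "A", 9: "B", 10: "A", 11: "A", 12: "B"}


def summarize(key, responses):
    """Count responses of each kind and derive the two trial indices."""
    relevant_recognized = sum(
        1 for n, k in key.items() if k == "A" and responses.get(n) == "A")
    irrelevant_recognized = sum(
        1 for n, k in key.items() if k == "B" and responses.get(n) == "B")
    irrelevant_judged_relevant = sum(
        1 for n, k in key.items() if k == "B" and responses.get(n) == "A")
    return {
        "relevant_recognized": relevant_recognized,
        "irrelevant_recognized": irrelevant_recognized,
        "irrelevant_judged_relevant": irrelevant_judged_relevant,
        # Trial index 1: total number of correct responses.
        "total_correct": relevant_recognized + irrelevant_recognized,
        # Trial index 2: twice the relevant statements recognized,
        # less the irrelevant statements judged to be relevant.
        "weighted_index": 2 * relevant_recognized - irrelevant_judged_relevant,
    }


scores = summarize(KEY, RESPONSES)
print(scores)
```

Both indices are built from the same simple counts, which may help explain why they correlated highly with those counts and added little beyond them.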

Scores on this test may be interpreted in terms of the an- 
swers to the following three questions: 


1. To what extent can the pupil reach logical conclusions 
in situations which may involve his attitudes 
and prejudices? 

2. To what extent can the pupil justify his conclusions 
in terms of certain principles of logical reasoning? 

3. How well can the pupil apply each of the four prin- 
ciples of logical reasoning? 


By study of the various scores reported on the data sheet, 
the teacher may obtain evidence relative to each of these 
questions. Different patterns of behavior analogous to those 
described for the test on Interpretation of Data are identi- 
fiable in terms of the relation of the separate scores to the 
group averages. 


VALIDITY AND RELIABILITY OF FORM 5.12 


The construction of Form 5.12 of the Logical Reasoning 
test was undertaken in the light of two kinds of previous 
experience. The previous forms of the test had been given 
to selected groups of students and the test results carefully 
studied. The criticisms of certain teachers who were endeav- 
oring to promote the logical reasoning objective were avail- 
able. Sometimes these teachers based their criticisms upon 


19 This data sheet is similar to those presented above for the tests on 
Interpretation of Data and Application of Principles of Science. For a 
sample copy and detailed description of the interpretation of scores from 
this test the reader is referred to the manual, obtainable from the 
Progressive Education Association. 

their experiences in administering the tests and interpreting 
the test results. Sometimes these teachers had met in groups 
for the purpose of studying and criticizing the tests. Both 
the studies of test results and the suggestions made by teach- 
ers as individuals or as discussion groups helped the test 
makers with the construction of test Form 5.12. In par- 
ticular, the problem situations were chosen with regard for 
the interests of secondary school students. Most of the prob- 
lem situations in this test form are taken directly from state- 
ments found in the feature articles and in the editorial pages 
in newspapers. These quotations were edited to some extent 
to avoid the introduction of extraneous factors such as unnecessary 
vocabulary difficulties, lack of clear antecedents 
for pronouns, and the like. The statements regarding the 
logical structure of the paragraphs which set forth the prob- 
lem situations were carefully chosen in an effort to make 
them typical of the kinds of statements which students com- 
monly make when they are discussing the logical structure 
of such paragraphs. Several readers went over each test 
problem carefully in an attempt to discover loopholes in its 
logical structure. Although it is probably quite impossible 
to construct a lifelike argument to illustrate just one principle 
of logical reasoning, and express this argument without 
ambiguity in words, an effort was made to approach 
this ideal in the test situations included in Form 5.12 of the 
Logical Reasoning test. 

The studies upon which the scoring of Form 5.12 of the 
Logical Reasoning test was based were described above. It 
is important to note that even a carefully constructed test, 
which actually provides opportunities for the behaviors in 
terms of which the objective is defined, may become invalid 
if the system of scoring adopted does not yield scores which 
present a true picture of the significant behaviors called forth 
by the test. Hence it should be noted that careful attention 
was given to the mode of scoring of Form 5.12 of the Logical 
Reasoning test. When conditions of administration are appropriate 
and when the results are interpreted by a person 
who is familiar with the objective and the structure of the 
test, Form 5.12 provides a measure of a range of significant 
behaviors related to the logical reasoning objective. 

For the purpose of statistical analysis, the scores of 351 
students, of whom 292 were finishing grade ten, 28 were in 
grade eleven, and 31 in grade twelve, were used. These 
students were all attending public high schools when tested 
and composed nine classes in grade ten, one class in grade 
eleven, and one class in grade twelve. The statistical data 
presented in Appendix II on reliability, intercorrelations of 
scores, and so forth, Table 6, are based upon a study of 
these 351 students. Within certain definite limitations these 
data would apply to other groups of students in the tenth, 
eleventh, and twelfth grades. 

The statistical constants presented will provide enough 
basic information to enable the teacher trained in statistics 
to study the significance of changes in the mean scores of 
a class or in the scores of an individual student. 

Form 5.12 of the test on the Application of Certain Prin- 
ciples of Logical Reasoning is recommended only for classes 
where conscious attention has been directed toward logical 
reasoning. Otherwise, the students are apt to wonder why 
they should attempt to reach logical conclusions which are 
sometimes contrary to their “better judgments.” The judgment 
of the teacher as to the readiness of his class for problems 
of the type included in the test is for this reason very 
important. 


IV. THE NATURE OF PROOF 


ANALYSIS OF THE OBJECTIVE 


In the past, teachers of several of the subject fields in 
the secondary school curriculum have been concerned with 
particular aspects of “proof.” For example, one of the objectives 
for courses in demonstrative geometry is to develop an 
understanding of the meaning of proof, and students in such 
courses have been expected to learn to prove theorems of 
geometry. Teachers of courses in which oral and written ex- 
pression is emphasized have also been concerned with cer- 
tain aspects of proof. Logical organization has been sought 
in themes and speeches. Courses in science have relied heav- 
ily upon laboratory experiments to "prove" certain laws, and 
students have been expected to learn to cite experimental 
evidence for their conclusions. Similarly, teachers of other 
subject-matter fields have objectives related to the concept 
of proof, in each case with connotations rather specific to 
their own field. The following paragraphs present a gener- 
alized definition of an objective which has come to be called 
"the nature of proof." 

Both children and adults in our society are constantly bombarded 
with “proofs”; i.e., by arguments designed to convince 
them that they should act in certain ways or should 
believe in certain things. The whole field of advertising 
directs its efforts toward convincing people to act in certain 
ways. Children of elementary school age are persuaded 
by a radio announcer to ask mother to buy a certain brand 
of breakfast food. Newspapers and magazines contain cartoons 
which set forth the dramatic stories of lives set right 
by buying and using a different brand of soap. The editorial 
pages encourage readers to adopt one of several possible 
courses of action. Even the news articles in the daily papers 
are likely to reflect the policy and convictions of the management, 
and hence may be said to be one of the kinds of 
“proofs” with which people are bombarded. The books and 
magazines they read, the plays and movies they see, the lectures 
and radio talks they hear, and the conversations they 
have with their associates, all play a part in forming the 
convictions upon which the actions of people are based. 

In particular, students in secondary schools react to the 
proofs which they meet in their daily experiences. Authorities 
on the secondary school curriculum and classroom 
ities on the secondary school curriculum and classroom 
teachers have expressed concern with the problem of guid- 
ing the reactions of the students to these proofs. This concern 
has led many teachers to attempt to have students be- 
come critical of proofs and to have students acquire the abil- 
ities needed for analyzing proofs. It would be ineffective to 
have students become critical of the proofs which they en- 
counter unless the students also acquired some of the abil- 
ities needed in analyzing proofs. On the other hand, the 
ability to analyze proofs is not likely to function unless there 
is a disposition to analyze proofs when the need for such 
analysis arises. Hence the nature of proof objective should 
include the ability to judge proofs, and also the disposition 
to apply this ability on appropriate occasions. 

It should be noted explicitly that any of the physical 
senses may be the medium for arriving at proofs. Touch, 
taste, or smell may be the basis for simple proofs. The ques- 
tion, "Are the potatoes salty?” is easily answered; the method 
is to taste them. Sometimes visual impressions also provide 
simple and direct proofs, but often these impressions involve 
more subtle factors. The hand may be quicker than the eye; 
the story told by the moving picture may create certain im- 
pressions which lead up to an intended conclusion through 
a series of inferences. Verbal presentations such as speeches 
and debates are also common vehicles for proof. The written 
“proofs” which are so frequently met in daily life have 
much in common with proofs in the other forms. It is with 
arguments or proofs presented in written form that we shall 
be chiefly concerned in this chapter. 

One of the important characteristics of proofs should be 
noted immediately. Some proofs proceed mostly from stated 
opinions or convictions. Other proofs are based in part upon 
data derived from experiments or investigations. Both of 
these kinds of proofs will involve certain basic assumptions 
which may be more or less tenable. Whatever the subject 
matter with which a proof deals, and whatever the form of 
presentation in which the proof appears, the location and 
appraisal of the basic assumptions upon which the sound- 
ness of the proof depends becomes a fundamental ability 
connected with analyzing proofs. 

In the light of the preceding remarks, some of the behaviors 
which might be chosen to characterize progress toward 
the achievement of the nature of proof objective are 
listed below: 

a. Disposition to analyze proofs critically. 

b. Ability to recognize the basic assumptions upon which a 
   conclusion depends, and to see the logical relationships 
   between these assumptions and the conclusion. 

c. Recognition of the need for further data to confirm, qualify, 
   or negate the available evidence. 

d. Ability to distinguish between assumptions whose tenability 
   could be checked by collecting further data and 
   assumptions whose tenability could not be checked in 
   this way. Examples of assumptions of the latter sort are 
   value judgments, statements of preference, and definitions 
   of terms. 

e. Recognition of the possible ways for studying a problem 
   further, and ability to distinguish between fruitful and 
   unfruitful methods of further study. 

f. Willingness to accept or reject assumptions tentatively, 
   and to test the conclusions which follow from these assumptions 
   by acting upon them. 

g. Recognition that new evidence upon the soundness of 
   one or more of the assumptions may make it desirable 
   to reconsider the argument and perhaps to qualify the 
   conclusion tentatively reached. 



The efforts of the Evaluation Staff to measure behaviors 
relative to the Nature of Proof objective were directed toward 
measuring the abilities connected with analyzing written 
arguments rather than toward the disposition to analyze 
arguments critically. Even when the problem was reduced 
to measuring the skills involved in the critical analysis of 
arguments, it was found to be an extremely complex prob- 
lem. Groups of teachers in the Eight-Year Study were en- 
thusiastic in their approval of the objective, and they sug- 
gested many behaviors which seemed to them significant. 
The task of clarification and simplification was much greater 
than was originally anticipated. The early forms of the test 
used experimentally in an attempt to secure insight into the 
nature of proof objective were too complicated for prac- 
tical purposes. The persons who worked on this problem 
were, however, convinced that the objective is very sig- 
nificant for general education at the secondary level and 
that a continuing effort to overcome the obstacles set up by 
its complexity is worthwhile. 


THE DEVELOPMENT OF EVALUATION INSTRUMENTS 


The first nature of proof tests which were constructed pre- 
sented the student with a described situation which presum- 
ably led to a conclusion, and he was asked to write down 
the assumptions which seemed to him to underlie the argument.20 
An analysis of the responses indicated that for the 
most part they could be classified into a few types. For ex- 
ample, a uniqueness assumption is often needed to clinch 
an argument—an assumption which states that a product 
advertised, or a chemical used in an experiment, etc., is the 
only one which has a given property. 

The student responses and the results of the analysis were 
utilized in the construction of a short-answer form. A list of 
statements relative to a problem situation was given, including 

20 Cf. H. P. Fawcett, The Nature of Proof, Thirteenth Yearbook of the 
National Council of Teachers of Mathematics (New York, Bureau of 
Publications, Teachers College, Columbia University, 1938). Appendix, Part I. 

some which purported to represent facts and others 
which were assumptions. Students were asked to distinguish 
facts from assumptions, to reconstruct the argument by using 
statements from the list, and to indicate whether they would 
accept or reject the conclusion of the reconstructed argu- 
ment. 

The results from the first short-answer form threw a good 
deal of light on the thinking of the students. Difficulties fre- 
quently arose, however, with respect to the use of the terms 
“fact” and “assumption,” and the first part of the test did 
not discriminate well among students. The scoring of the 
reconstructed arguments also caused difficulty. The test was 
therefore revised several times, but limitations of space prevent 
a discussion of the resulting experience. Only the forms 
which the test had taken toward the end of the Study can 
be described here. 

Form 5.21 of the Nature of Proof test incorporated several 
major changes. An attempt was made to have the students 
locate the basic assumptions underlying the argument, 
but the term assumptions was not used in the directions to 
the student. In each problem a paragraph which presumably 
justified a conclusion stated at the close of the paragraph 
was presented. There followed a list of statements. 
Some of these statements were relevant, in the sense that 
they described assumptions underlying the argument, and 
some of them were irrelevant. The students were asked to 
pick out the relevant statements and to decide which of 
these might logically be used to support the stated conclusion. 
In this way the students were given an opportunity 
to locate basic assumptions, and to recognize the function 
of these assumptions in an argument, although the word 
assumption was not used in the test directions. 

One of the problems taken from Form 5.21 of the Nature 
of Proof test is given below. The directions, in a shortened 
form, are presented along with the problem.21 A list of 
responses accepted as correct in scoring the test, i.e., a test 
key, is given in the margin. It should be noted that the test 
key adopted by a committee of competent persons before 
the test was given to students was changed to some extent 
when the test results for a sample group of students were 
studied. It became apparent that the “C” step in the direc- 
tions was interpreted differently by the students than by 
the committee. There were also apparent differences in the 
interpretation given to the “C” step by students. It should 
also be noted that there was no decision as to a “correct” 
response to the conclusion. 


Read the problem and then: 


A. Select the statements which either support or contradict the 
underlined conclusion. 

B. Select the statements marked under A which support the 
underlined conclusion. 

C. Select the statements marked under B which you do not con- 
sider satisfactorily established by whatever general knowl- 
edge you may have, but which must be included in the 
argument if the conclusion is to be completely justified. 


Conclusion. According to what seems most consistent with your 
analysis thus far, decide whether you: 


A. Are inclined to accept the conclusion. 
B. Are very uncertain about the conclusion. 
C. Are inclined to reject the conclusion. 


Reasons. Select the statements marked under C which might 
cause you to reconsider your decision about the underlined 
conclusion if more information were made available 
to you. Mark these under D. 


21 The use of A, B, C, D in the directions below is clarified by the com- 
plete directions, by the form of the special answer sheet on which the 
student makes his responses, and also by a sample exercise explained in the 
general directions. In the marginal keys below, these letters refer to the 
columns in which a statement should be marked on the answer sheet. 
PROBLEM IX 

In a radio broadcast the following story was told: “The people 
in a little mining town in Pennsylvania get all their water with- 
out purification from a clear, swift-running mountain stream. 
In a cabin on the bank of the stream about a half a mile above 
the town a worker was very sick with typhoid fever during the 
first part of December. During his illness his waste materials 
were thrown on the snow. About the middle of March the snow 
melted rapidly and ran into the stream. Approximately two weeks 
later typhoid fever struck the town. Many of the people became 
sick and 114 died.” The speaker then said that this story showed 
how the sickness of this man caused widespread illness, and the 
death of over one hundred people. 


Statements: 

ABCD         1. Typhoid fever organisms can survive for at least 
                three months at temperatures near the freezing 
                point. 

Irrelevant   2. Good doctors should be available when an epidemic 
                hits a small town. 

ABCD         3. Typhoid fever germs are active after being carried 
                for about half a mile in clear, swift-running water. 

A            4. There may have been other sources of contamination 
                by waste materials containing typhoid fever 
                germs along the stream or at some other point in 
                the water supply of the town. 

             5. The waste materials of a person who has a severe 
                case of typhoid fever contain active typhoid organisms. 

             6. Typhoid fever is contracted by taking the typhoid 
                organisms into the body by way of the mouth. 

Irrelevant   7. Only a few people in this town had developed an 
                immunity to typhoid fever. 

             8. Typhoid organisms are usually killed if subjected 
                to temperature near the freezing point for a period 
                of several months. 

Irrelevant   9. Sickness and death usually result in a great economic 
                loss to a small town. 

ABCD        10. The only typhoid organisms with which the people 
                in the town came in contact were in the water 
                supply. 

Irrelevant  11. Vaccination should be compulsory in communities 
                which have no means of purifying their water 
                supply. 

ABCD        12. The worker's waste materials were the only source 
                of contamination along the stream. 

A           13. There may have been other sources of typhoid 
                fever germs in the town such as milk or food contaminated 
                by some other person. 

AB          14. The symptoms of typhoid fever usually appear 
                about two weeks after contact with typhoid germs. 


Several further comments on the structure of this sample 
problem might be added to those made above. When the 
student has chosen the statements which he thinks support 
the stated conclusion, he is asked to decide which of these 
are essential assumptions whose truth he would question. 
On the basis of his analysis of the problem, the student is 
then asked to indicate the degree of his acceptance of the 
stated conclusion. Finally the student is asked to decide 
which of the essential assumptions might, in the light of 
further evidence, make it necessary to reconsider his deci- 
sion about the stated conclusion. 
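
The directions make each step a selection from the statements chosen in the step before, so a consistent answer sheet has its columns nested: anything marked B is also marked A, anything marked C is also marked B, and anything marked D is also marked C. The sketch below illustrates that dependency. The key excerpt uses a few clearly legible entries from the margin of Problem IX; the student marks are hypothetical, and the function name is an assumption introduced for this example.

```python
# Each later column presupposes the one before it.
REQUIRES = {"B": "A", "C": "B", "D": "C"}


def nesting_violations(marks):
    """Return statement numbers whose marks skip a prerequisite column."""
    bad = []
    for number, columns in sorted(marks.items()):
        if any(REQUIRES[c] not in columns for c in columns if c in REQUIRES):
            bad.append(number)
    return bad


# Excerpt of the marginal key for Problem IX.
key_excerpt = {
    1: set("ABCD"),
    4: {"A"},
    10: set("ABCD"),
    12: set("ABCD"),
    14: set("AB"),
}

# Hypothetical student answer sheet for the same statements.
student_marks = {
    1: set("AB"),
    4: {"A"},
    10: set("ACD"),   # C marked without B
    12: set("ABC"),
    14: {"B"},        # B marked without A
}

key_violations = nesting_violations(key_excerpt)
student_violations = nesting_violations(student_marks)
print(key_violations, student_violations)
```

A check of this kind shows how a slip at one step carries into the later steps, which is one of the difficulties of interpretation discussed for this form.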

The relationship between the activities which the students 
were directed to perform and the definition of the objective 
in terms of behavior will be apparent to the reader. Under 
ideal conditions the activities which the student performs 
might be expected to yield evidence on the students’ ability 
to recognize the basic assumptions in an argument, the 
standard of proof which the student demands, the student's 
recognition of the tentative nature of the conclusions which 
are based upon arguments, and the role of reexamining the 
underlying assumptions in order to qualify the conclusions 
which one reaches. In practice, the results do not yield valid 
evidence on achievement relative to all of these behaviors. 
For example, students vary a great deal in the number of 
statements which they recognize as supporting the stated 
conclusions. This makes the number of opportunities to challenge 
assumptions different for different students. A still 
more serious consideration is the possibility for variation in 
the interpretation of the test directions from student to student. 
Such variation was noted particularly in connection 
with the directions for challenging the truth of the statements 
which had been marked as supporting the stated conclusions. 
Moreover, the fact that the various activities which 
the students are requested to carry out are interrelated, so 
that failure to perform one step seriously interferes with performing 
the next step, presents a difficulty in interpreting 
the results. In this connection the number and complexity 
of the related activities which the students were asked to 
carry through proved discouraging to many students. 

In the next section a description of the structure of Form 
5.22 of the Nature of Proof test, in which the attempt is made 
to avoid some of these difficulties, is presented. 



Structure of the Nature of Proof Test, Form 5.22 

The progress toward Form 5.22 has involved an attempt to 
simplify both the procedures which students are asked to 
carry out and the directions for carrying out these 
procedures. At the same time there has been an attempt to 
retain many of the aspects of thinking commonly associated 
with problem-solving and scientific method. 

A study of the following explanation of the structure of 
the test problems in comparison with the sample test 
problem presented below will serve to clarify the reasons for 
the inclusion of each part of the test. A list of the responses 
accepted as correct by a jury of competent persons, i.e., a 
test key, is given in the margin. 


PROBLEM III 
A science class was studying methods of caring for the skin. The 
teacher described the following experiment and stated the con- 
clusion which had been drawn from it. “A large bottle of each 
of the five leading brands of hand lotion was purchased from 
a drug store. The lotion in each bottle was thoroughly mixed by 
shaking the bottle for three minutes. Five exactly similar water 
glasses, one for each lotion, were set in a row on a table, and a 
piece of filter paper was placed over the open top of each glass. 
Each brand of lotion was tested by pouring a half teaspoonful 
of it on the piece of filter paper. For the first brand of hand 
lotion, drops appeared in the water glass within thirty seconds. 
The other four brands all took longer than one minute, and two 
brands failed to filter through at all.” This experiment shows that 
the first brand of lotion is absorbed by the skin more readily 
than any of the others. 


I. Directions: In this part, you are to do two things: 
Select all statements which could logically be used to support 
the underlined conclusion. Blacken the space under A opposite 
the number of each such statement. 


At the same time, select all statements which might make the 
underlined conclusion less acceptable. Blacken the space under 
B opposite the number of each such statement. 


In this part of the test, your decision about a statement should 
not be influenced by whether you believe the idea expressed 
to be true or false. 

Statements for I and II: 


AC          1. The contents of one large bottle of a certain brand 
               of hand lotion are exactly like the contents of any 
               other large bottle of the same brand of hand lotion. 
Irrelevant  2. The liquid which is absorbed most readily by the 
               skin is the most effective in softening the hands. 
B           3. To be absorbed by the skin a hand lotion need 
               not pass through the skin. 
Irrelevant  4. Hand lotions are of doubtful value. 
AC          5. The faster a liquid drips through filter paper the 
               faster it will be absorbed by the human skin. 
AC          6. The pores of the skin are quite similar to the little 
               holes between the fibers of filter paper. 
A           7. Since each bottle was given a thorough shaking, 
               the results for each lotion were typical of the 
               performance of the lotion in that bottle. 
B           8. The “pores” in filter paper are constructed quite 
               differently from the “pores” in the human skin. 
Irrelevant  9. The experiment was probably intended to make 
               sales for some cosmetics manufacturer. 
B          10. Although drops of a liquid appeared in the water 
               glass, certain ingredients of the first lotion may have 
               been retained by the filter paper. 
Irrelevant 11. The speed with which a lotion drips through filter 
               paper is no indication of its effectiveness in 
               softening the skin. 
B          12. Water will penetrate filter paper but is not absorbed 
               by the skin. 
Irrelevant 13. The obvious way to test the five lotions is to try 
               them on the hands of a large group of people. 
A          14. The amounts of lotion placed on each piece of filter 
               paper were very nearly the same. 


II. Directions: Select from the statements already marked under 
A (the supporting statements) those which you would 
challenge because you are not convinced they are true enough 
to be used in supporting the underlined conclusion. Blacken 
the space under C opposite the number of each such 
statement. 


III. Directions: Conclusions A, B, and C are stated below. Choose 
the one which seems to you to be most consistent with your 
analysis of the situation described in the problem. In the 
block at the top of the answer sheet, blacken the space A, 
B, or C to indicate the conclusion which you choose. 


Conclusions: 

√ A. This experiment does not help in deciding which one of 
the hand lotions would be most readily absorbed by the 
skin. 

B. The experiment suggests that the first brand of hand lotion 
is absorbed by the skin more readily than any of the others, 
but the experiment would have to be repeated several 
times. 

C. The experiment shows that the first brand of hand lotion 
is absorbed by the skin more readily than any of the 
others. 


IV. Directions: Hand lotions are commonly used to replace the 
oils in the outer layers of the skin which are lost through 
excessive exposure, washing, and other causes. Hence it may 
be less important to study the extent to which a lotion pene- 
trates the layers of the skin than to study its effect upon the 
surface of the skin. The statements presented below describe 
some activities which have been suggested to study the ef- 
fectiveness of a hand lotion in keeping the skin soft in the 
absence of an adequate supply of natural skin oils. 

Select all statements that describe activities which you think 
would help in studying this effect of a hand lotion upon the 
skin. Blacken the space under A opposite the number of 
each such statement. 

In this part of the test, your decision about a statement 
should not be influenced by whether you believe the activity 
described could actually be carried out. 

Statements for IV and V: 


           15. [...] a description of the structure of the human 
               skin. 
Irrelevant 16. Find out the names of the companies which 
               manufacture each of the brands of hand lotion used in 
               the experiment. 
A          17. Make a precise laboratory analysis of each of 
               several brands of hand lotion to find out the amounts 
               and properties of its principal ingredients, such as 
               vegetable oils, water, etc. 
Irrelevant 18. Repeat the experiment several times with the same 
               five lotions and under exactly the same conditions. 
AB         19. Set up an experiment in which ten boys and ten 
               girls apply a hand lotion to one hand and no hand 
               lotion to the other hand once each day for a month 
               and compare the results. 
Irrelevant 20. Send out a questionnaire to a large number of 
               users of hand lotion to find out which brand is 
               most popular. 
AB         21. Use hand lotions regularly on several parts of the 
               body and compare the results. 
A          22. Set up an experiment to compare the natural skin 
               oils to the oils contained in hand lotions. 
Irrelevant 23. Compare the absorbing power of filter paper and 
               human skin. 
           24. Look for published information about some of the 
               good and bad effects of using different brands of 
               hand lotion. 


V. Directions: Select from the statements already marked under 
A only things which you think you or your class in high school 
could actually carry out. Blacken the space under B opposite 
the number of each such statement. 

In each problem the student is given a paragraph which 
presumably justifies an underlined conclusion stated at the 
close of the paragraph. This is followed by 14 statements. 
Some of these statements are relevant in the sense that they 
describe assumptions underlying the argument, and some of 
them are irrelevant. Some of the relevant statements might 
be used to support the underlined conclusion and the 
remainder of them might be used to contradict it. In the first 
part of the test the student is asked to decide which of these 
statements are relevant and to mark them as either 
supporting or contradicting. In making these judgments, the 
student is to disregard the degree of truth or falsity which he 
may ascribe to the statements in the paragraph or to the 
statements listed below the paragraph. He is to judge the 
relevance of a given statement solely in terms of the 
context of the argument and to decide whether each relevant 
statement supports or contradicts the underlined conclusion. 

In the second part of the test the student's attention is 
directed toward those particular statements which he marked 
as supporting statements. He is asked to indicate the ones 
which he would challenge because he is not convinced that 
they are true enough to be used in supporting the underlined 
conclusion. Since the relevant statements describe assump- 
tions necessary in order to establish the underlined conclu- 
sion, in a sense the student is asked in the first two parts of 
the test to decide which statements are necessary assumptions 
in the argument, and of these, to choose the ones about 
which he is uncertain or is in doubt. 
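
The tallying implied by the first two parts of the test can be sketched in code. The Python fragment below is only an illustration and not the procedure used in the study. The key letters are transcribed from the margin of the sample problem as nearly as they can be read; the function name and the five tally names are assumptions introduced here.

```python
# Margin key for Problem III, Parts I and II, as read from the sample problem.
# "A" = supports the conclusion, "B" = makes it less acceptable,
# "C" = a supporting statement that should be challenged, "" = irrelevant.
KEY = {1: "AC", 2: "", 3: "B", 4: "", 5: "AC", 6: "AC", 7: "A",
       8: "B", 9: "", 10: "B", 11: "", 12: "B", 13: "", 14: "A"}


def side(letters):
    """Return "A", "B", or "" according to which column was blackened."""
    return "A" if "A" in letters else "B" if "B" in letters else ""


def score_parts_one_and_two(marks):
    """Tally one student's marks (statement number -> letters) against KEY."""
    tally = {"supporting_correct": 0, "contradicting_correct": 0,
             "irrelevant_marked_relevant": 0,
             "support_contradict_confused": 0, "critical_correct": 0}
    for number, keyed in KEY.items():
        marked = marks.get(number, "")
        keyed_side, marked_side = side(keyed), side(marked)
        if marked_side and not keyed_side:
            tally["irrelevant_marked_relevant"] += 1
        elif marked_side and marked_side != keyed_side:
            tally["support_contradict_confused"] += 1
        elif marked_side == "A":
            tally["supporting_correct"] += 1
        elif marked_side == "B":
            tally["contradicting_correct"] += 1
        # A challenge counts only for a statement correctly marked as supporting.
        if "C" in keyed and "C" in marked and marked_side == "A":
            tally["critical_correct"] += 1
    return tally
```

Because a challenge is counted only for statements the student has already marked as supporting, the sketch also makes visible the dependence of Part II upon Part I.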

In the third part of the test the student is asked to choose 
one of three stated conclusions. One of these conclusions ex- 
presses an acceptance, another, a qualified acceptance, and 
the third, a rejection of the underlined conclusion. In each 
problem the student is asked to choose the conclusion which 
seems to him to be most consistent with his analysis of the 
problem. In order to agree with the test key, the student 
should in two problems choose acceptance, in four problems 
choose qualified acceptance, and in two problems choose re- 
jections of the underlined conclusions. 
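
A comparison of a student's Part III choices with the key might be sketched as follows. This is an assumption-laden illustration: only the division of the eight problems into two acceptances, four qualified acceptances, and two rejections comes from the text. The order of the hypothetical key, the function name, and the labels "more_cautious" and "less_cautious" are introduced here.

```python
from collections import Counter

# Ordered from most to least accepting of the underlined conclusion.
LEVELS = ["accept", "qualified", "reject"]

# Hypothetical key for eight problems; only the 2/4/2 split is from the text.
HYPOTHETICAL_KEY = ["accept", "qualified", "reject", "qualified",
                    "qualified", "accept", "reject", "qualified"]


def compare_conclusions(choices, key=HYPOTHETICAL_KEY):
    """Count agreements with the key and the direction of each disagreement."""
    outcome = Counter()
    for chosen, keyed in zip(choices, key):
        difference = LEVELS.index(chosen) - LEVELS.index(keyed)
        if difference == 0:
            outcome["agrees"] += 1
        elif difference > 0:
            outcome["more_cautious"] += 1   # student accepts less than the key
        else:
            outcome["less_cautious"] += 1   # student accepts more than the key
    return outcome
```

Under this sketch a student who chose qualified acceptance in every problem would agree with the key in four problems and disagree in four, which suggests why the direction of disagreement may be worth recording along with the count.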

Parts I, II, and III of the test can be given and scored 
independently of the remainder of the test, and for some 
purposes may be considered sufficient. However, besides being 
able to test a stated conclusion (as in parts one and two) 
by an examination of the assumptions underlying the 
arguments which purport to establish this conclusion, it is also 
important to be able to recognize fruitful lines of further 
investigation and to distinguish between types of activities 
which are relevant to testing the conclusion and those which 
are not. It may also be considered important for students to 
learn to judge the practicability of a proposed line of 
investigation. Parts IV and V of the test were designed to secure 
evidence on the ability of students to appraise the relevance 
and practicability of proposals for the further study of a 
problem. 
In the fourth part of the test a significant problem which 
involves further study of the issues raised in Parts I, II, and 
III is stated. The student is asked to select from a list of 
statements those that describe activities which would help 
him to solve this problem. In making his judgment, the 
student is not to be influenced by whether he believes the 
activity described could be carried out in a practical sense. 
In the fifth part of the test the student's attention is 
directed toward those particular statements which he 
selected in Part IV. He is asked to indicate the ones which 
he thinks he or his class in high school could actually carry 
out. 
The scores given to students reflect their success or 
failure in carrying out the procedures in each part of the test. 
The interpretation of the results depends upon the 
interpreter’s understanding of the structure of the test problems. 
The usefulness of the test results is in direct proportion to 
the extent of the interpreter’s concern with the objective and 
his confidence that significant behaviors involved in the 
objective are actually sampled in the different parts of the test. 


Summarization and Interpretation of the Scores on Form 5.22 


During the experimental stages of Form 5.22, the test 
results for a sample population of 307 students were studied 
intensively in an attempt to discover the most convenient 
and most meaningful form for reporting the results. These 
students comprised 12 separate classes divided among the 
tenth, eleventh, and twelfth grades. The procedure 
described previously in connection with the test on 
Application of Principles of Logical Reasoning was also used in 
this case. Twenty-two scores were summarized for each 
student, and several additional scores were computed from 
these during the study. Certain important scores were se- 
lected and studied with reference to the item analysis in an 
effort to see more clearly the relationships between each of 
these scores and the responses of students to individual test 
items. Certain correlations between the various scores which 
had been summarized were run. It was found possible to 
reduce the number of scores on the data sheet from 22 to 18 
without an appreciable loss of information. Scores on per 
cent accuracy, computed as number of responses marked in 
agreement with the test key divided by total number of re- 
sponses of that kind, were tried and abandoned because 
they were somewhat unreliable and apt to be misleading. 
Moreover an examination of the scores on various kinds of 
errors which were also summarized yielded the desired in- 
formation in a slightly different form. A score on the per 
cent of the statements keyed as supporting and marked by 
students as supporting which the students also marked as 
critical was tried in an effort to secure an index of the “criti- 
calness” of a student. This score was found to correlate 
highly with a score on critical statements marked by stu- 
dents as critical statements. Hence a score on critical state- 
ments marked as critical was used as an index of the tend- 
ency of a student who had marked a statement as supporting 
to challenge its truth. This score when used as an index is 
not subject to the criticism that it depends upon the number 
of supporting statements which the student marked as sup- 
porting since the effect of this dependence was considered 
and found to be insignificant. 
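
The two ratio scores described in this paragraph can be written out as a short sketch. The Python below is illustrative only: the function names and the choice to return `None` when a denominator is zero are assumptions made here, and the statement numbers in the usage note are invented for the example.

```python
def per_cent_accuracy(agreeing, total):
    """Responses agreeing with the key as a per cent of all responses of that kind."""
    if total == 0:
        return None  # undefined when the student made no responses of this kind
    return 100.0 * agreeing / total


def per_cent_challenged(keyed_supporting, marked_supporting, marked_critical):
    """Per cent of correctly marked supporting statements which were also challenged."""
    correctly_supporting = set(keyed_supporting) & set(marked_supporting)
    if not correctly_supporting:
        return None
    challenged = correctly_supporting & set(marked_critical)
    return 100.0 * len(challenged) / len(correctly_supporting)
```

For instance, a student who marks statements 1, 5, 8, and 14 as supporting where 1, 5, 6, 7, and 14 are keyed as supporting, and who challenges 1 and 8, has challenged one of three correctly marked supporting statements. Both scores rest on a denominator which varies from student to student, which may be part of the reason such ratios proved apt to mislead.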

?? See pp. 122-123. 


The scores on this test can be interpreted in terms of the 
answers to five questions: 


1. To what extent does the student recognize relevant 
phases of an argument, and distinguish between 
considerations which support and ones which contradict 
a stated hypothesis or conclusion? 

2. To what extent does the student challenge the 
assumptions underlying an argument, and distinguish 
between assumptions which, from the point of view 
of a committee of adults, should and should not be 
challenged? 

3. How do the conclusions reached by the student 
compare with those reached by the committee who made 
the test? 

4. To what extent does the student recognize the 
relevance of proposals for the further study of a problem? 

5. To what extent does the student judge the relevant 
activities as practicable, i.e., distinguish between 
activities which, from the point of view of a committee 
of adults, are and are not practicable? 


By a study of the various scores reported on the data sheet 
the teacher may obtain evidence relative to each of these 
questions. It is particularly true of this test that the number 
of patterns of behavior revealed by the test scores is almost 
as great as the number of students who take the test. Each 
pattern should be considered as a unique situation to be 
interpreted. 


VALIDITY AND RELIABILITY OF TEST FORM 5.22 


The construction of Form 5.22 of the nature of proof test 
was undertaken in the light of a good deal of negative and 
some positive evidence on the behaviors of secondary school 
students relative to the nature of proof objective. Certain 
“don'ts” were clearly indicated by experience with previous 
forms of the test. For example, in Form 5.21 the dependence 
of each step upon preceding steps made the interpretation of 
the test results difficult. At the same time, a number of “do's” 
were indicated. For example, the realization that the basic 
assumptions upon which a conclusion depends may be ex- 
pressed in the form of statements which either support or 
tend to contradict the conclusion (as opposed to statements 
which are irrelevant) made it possible to get at the concept 
of assumptions in operational terms. 

In approaching the construction of Form 5.22 of the nature 
of proof test, a need was felt for another check upon the 
direct responses of students. The students in a geometry class 
of a large public high school not participating in the Eight- 
Year Study were selected for this purpose. The teacher of 
this class was known to be working actively to improve the 
achievement of this objective. For purposes of illustration, 
one of the four test exercises which were given is reprinted 
below together with the responses which one student made 
to the questions. 


Exercise II 


Read the paragraph and then answer the questions which follow. 
Speed is not at all important. You should take enough time to 
organize your ideas and to state them precisely. 


In an agriculture class the teacher was discussing the importance 
of the use of fertilizer. He described the following experiment: 
“Some wheat seeds were planted in two large pots of earth. The 
seeds were of the same variety, and the soil used had been thor- 
oughly mixed and then divided into two parts, one for each pot. 
Fertilizer was added to one and not to the other. The pots were 
then placed side by side in a greenhouse and both regularly and 
equally watered. At the end of three months the wheat plants in 
the fertilized pot weighed twenty-five per cent more than those 
in the unfertilized pot.” 


The class came to the following conclusion: “Farmers who use 
this fertilizer on land on which they raise wheat will get larger 
yields of grain.” 


1. Indicate your reaction to the underlined conclusion by a check 
mark (√) in one of the three spaces provided. 


After a consideration of this experiment I feel that the 
underlined conclusion is: 


Probably true √, Completely uncertain ___, Probably false ___. 


Explain your answer in some detail, that is, tell why you felt 
that the underlined conclusion was probably true, completely 
uncertain, or probably false. 

“I felt that the underlined conclusion was probably true, because 
if the fertilizer had been placed in the pot where the wheat seed 
grew the faster, then that would prove it. 

“Especially if the soil had been mixed thoroughly and the pots 
watered equally each day.” 

2. What things does the class have to assume (take for granted) 
if the underlined conclusion is to be considered true? You may 
include statements of ideas reported in the above experiment 
and also statements of ideas not actually mentioned. Make a 
separate statement for each assumption which you wish to 
point out, and number these statements 1, 2, 3, ..... 


“1. The wheat seeds were the same. 
2. The soil was thoroughly mixed. 
3. The two plants were regularly and equally watered. 
√ 4. All wheat, even from the same crop, grows the same as 
the rest.” 

3. Now go back to the statements which you listed under point 
2 above. You may feel that some of these statements should 
not be assumed or taken for granted. Place a check mark (√) 
beside the number of each statement which you feel should 
not be taken for granted. 

4. What things occur to you which, if true, would contradict the 
underlined conclusion? Make a separate statement for each 
contradictory idea and number these statements 1, 2, 3, ..... 

“1. Two wheat seeds, even from the same crop, would most 
likely not grow the same, even under the same condi- 
tions.” 

5. In what ways could the above experiment be improved? Make 
a separate statement for each suggested improvement and 
number these statements 1, 2, 3, ..... 


“1. Take more than two pots and then let them grow under 
the same conditions, because the more seeds you use, 
the more perfect will be your conclusion. 

“2. Take soil from same general location and mix, putting 
one with fertilizer and one without. Use water equiva- 
lent to general rainfall in location from which soil is 
taken from, and at approximately the same intervals. 

“3. Run tests over a greater period of time.” 


Several significant observations were made from this 
investigation. The rather weak response which the student 
quoted made to question 1 (the general direction was 
“explain your answer in some detail”) is typical of this sample 
of students. In response to question 2 some of the students 
wrote out basic assumptions which went beyond a mere 
repetition of the statements made in the paragraph. Other 
students found even more difficulty at this point than did 
the student whose responses are presented above. The 
responses to question 3 are dependent upon the responses to 
question 2 and as a result were significant only for students 
whose performance on question 2 was satisfactory. The 
responses to question 4 seldom yielded new ideas not 
previously expressed in the answers to questions 1 and 2. An 
appreciable number of the students introduced several new 
ideas in their responses to question 5. The student whose 
responses are presented above is an example. In summary, 
the results were as follows: (1) The general direction 
“Explain your answer in some detail” does not elicit detailed, 
comprehensive answers. (2) There is a considerable 
difference in the minds of some students between locating 
assumptions upon which a conclusion depends and suggesting 
ways for improving the argument upon which a conclusion 
depends. In the light of the first point, we would expect 
difficulties if we attempted to compare the written responses of 
students to the general direction “Explain your answer in 
some detail” to their responses on an objective test. In the 
light of the second point, it may be worthwhile to include in 
an objective test two logically equivalent forms of questions 
relative to underlying assumptions: (a) pick out the 
statements of underlying assumptions, (b) pick out the 
statements of activities relevant to improving the argument. The 
reader will recall from his study of the sample problem that 
an attempt was made in constructing Form 5.22 of the Nature 
of Proof test to include questions of these two kinds. 

The construction of Form 5.22 of the Nature of Proof test 
was undertaken by a committee of five persons with the as- 
sistance at certain stages of several other persons. The test 
situations and test directions were viewed critically in the 
light of all of the available evidence from previous forms of 
the test. An analysis of the statements made by various stu- 
dents provided helpful suggestions for the construction of 
statements to be included in the objective form of the test. 
The kinds of irrelevant statements which the students made 
were especially helpful in building irrelevant statements 
which would be used as relevant by an appreciable number 
of students. The results of the statistical study which is de- 
scribed below indicate that the directions to the students are 
unambiguous, and that several distinct behaviors are meas- 
ured by the test. The evidence available to date strongly in- 
dicates that, under certain conditions of administration, 
Form 5.22 of the Nature of Proof test provides a valid meas- 
ure of a certain range of behavior relative to the nature of 
proof objective. 

For the purpose of statistical analysis, 307 students (115 
finishing grade ten, 96 in grade eleven, and 96 in grade 
twelve) were selected. These students were all attending 
public high schools when tested and composed five classes 
in grade ten, three classes in grade eleven, and four classes 
in grade twelve. The five classes in grade ten, and one of 
the classes in grade twelve, were then completing a course 
which emphasized the nature of proof objective.²² In the 
remaining groups there was an awareness of this objective, but 
less specific attention to it. The results of the study seem to 
indicate that at the present there would be little advantage 
in computing grade norms, since the emphasis given to the 
objective has more influence on the scores than does the 
grade placement of the students from the tenth to the twelfth 
grades. 

The statistical data presented in Appendix II, Table 7, 
are based on this population of 307 students. Within limita- 
tions these data would apply to other groups of students in 
the tenth, eleventh, and twelfth grades. If a chosen group is 
comparable to the sample group, the statistical constants pre- 
sented in Appendix II, Table 7, will provide enough basic 
information to enable the teacher trained in statistics to 
study the significance of changes in the mean scores of a 
class or in the scores of an individual student. The reliabil- 
ities of the various scores are in general not as high as have 
been obtained in other tests of thinking abilities. A number 
of the scores are, however, fairly reliable, and it is a reason- 
able hypothesis that the interpretations drawn on the basis 
of a careful examination of the patterns of scores are more 
trustworthy than the reliability of the separate scores would 
suggest. 


A RELATED INSTRUMENT 


A group of objectives which are closely related to those 
discussed in connection with the discussion of Logical 
Reasoning and the Nature of Proof relate to what is popularly 
known as “propaganda analysis.” During the Eight-Year 
Study some attention was given to evaluation with respect 
to these objectives. This section will give a brief account of 
this project. 

22 This course followed somewhat the pattern outlined by Fawcett, loc. 
cit. 

The definition of propaganda which was adopted is as 
follows: Propaganda represents any use of the spoken or 
written word, or other forms of symbolization (pictures, movies, 
plays) designed to convince people to hold certain 
opinions, to give allegiance to a particular group or cause, or to 
pursue some kind of social action predetermined by the 
source of the propaganda. As used in this sense, propaganda 
has no unpleasant or “bad” overtones. Our concern with it is 
to better understand which groups are selling what kind of 
propaganda; the possible social consequences and 
implications of this; the symbol appeals which are used and their 
relation to behavior dynamics of individuals; the relation of 
susceptibility to propaganda to social conditions; etc. 

Propaganda also is used to characterize forms of argument 
which are untenable in terms of certain intellectual or logical 
criteria such as: documenting evidence, presenting several 
sides of a problem, drawing conclusions which follow 
logically from the data, minimizing the use of slogans and 
“emotional” terms, etc. Used in this sense propaganda does have 
unpleasant overtones and our problem is to teach pupils to 
react critically to it by applying criteria of good argument. 

The scope of this report takes both of these definitions into 
consideration. 

Among the behaviors which were listed as important 
objectives of education related to propaganda analysis were the 
following: 

a. Recognition of the purposes of authors of propaganda, 
that is, ability to make more discriminating judgments as 
to the points of view which it is intended the consumer 
should accept or reject. (In the broad sense, this refers 
to the generally accepted concept of “reading compre- 
hension.”) 

b. Identification of the forms of argument used in selected 
statements of propaganda. (This refers to reading com- 
prehension in a different sense.) 

c. Recognition of forms of argument which are considered 
intellectually acceptable and which are not employed in 
certain statements. 

d. Critical reaction to the forms of argument which repre- 
sent typical devices employed in propaganda. 

e. Ability to analyze argument in terms of principles of the 
nature of proof. 


f. Recognition of the relation of propaganda to the social 
forces which breed it. 

g. Knowledge of the psychological mechanisms involved in 
the susceptibility of people to certain language symbols. 


The evaluation instrument entitled Analysis of Contro- 
versial Writing (Form 5.31) was developed to obtain evi- 
dence concerning the achievement of the first four behaviors 
listed above. Item e in the list has been discussed at some 
length above. The others, although they were considered 
important and some preliminary analyses of them were 
made, were not explored during the study. The test contains 
ten samples of writing on controversial issues selected from 
magazines and newspapers. The choices were made on the 
basis of the following criteria: (1) the selection should 
focus upon a controversial issue; (2) liberal and conserva- 
tive sources were represented on each issue; (3) the group 
of selections should make use of a variety of propaganda 
devices; (4) the issues involved should represent areas of 
tension for pupils. 

In each problem the pupils were first directed to read the 
quotation carefully, and then in Part I to mark them so as to 
indicate statements where there is: 

A. evidence that the author of the quotation wants you to 
agree with or accept the idea in the statement. 
B. evidence that the author wants you to disagree with or 
reject the idea in the statement. 
C. no evidence as to whether the author wants you to agree 
or disagree with the idea in the statement. 


Twelve statements follow these directions. The examples 
below are taken from Problem I, based on a selection whose 
tenor may be judged from the closing sentence in one 
paragraph: “The American system of private industry and 
business has distributed more income to more people than any 
other system in the history of the world.” 

1. The present purchasing power of workers is possible only 
under a system of private ownership of industry. 

2. Workers should receive higher wages than they receive 
at present. 

3. The present system of private ownership is superior to 
any other way of organizing industry. 

4. Industry still has far to go in distributing wealth more 
evenly between the workers and the owners. 

5. The profits of corporations should be turned over to the 
workers rather than to stockholders. 

In Part II, the student was to decide:** 
first, which of the following statements represent forms of 
argument used by the author in this situation, and second, which 
ones represent desirable forms of argument whether used by the 
author or not. 

1. Assumes that the point of view expressed in the article is 
that which is held by the majority of Americans. 

2. Gives facts in such a way that the reader can check their 
source to see whether they have been reported accurately. 

3. Uses statistics for industries in which wages are among 
the highest to illustrate the rise in wages. 

4. Presents some of the major advantages and disadvantages 
of our system of private ownership of industry. 

5. Indicates that there will be undesirable consequences to 
industry if our present industrial system is changed. 

6. Tries to make us feel sympathetic toward industrial 
owners. 

** The following quotation is an excerpt from the directions. 


Ten statements of this general sort were used in Part II of 
each Problem. In both parts the various statements were so 
chosen that a student responding according to the direc- 
tions could reveal evidence of his status with respect to the 
first four behaviors listed above. 

The scores of the pupils in Part I are tabulated in the fol-
lowing descriptive categories:²²


General Objectivity. Scores in this category represent the per
cent of total correct responses and show the relative objectiv-
ity with which the pupil interprets highly biased material.

Non-Recognition of conflicting points of view. Pupils who have
difficulty in recognizing ideas which are contradicted by the
author's data can be identified through scores in this category.

Misconception of author's purposes. Scores in this category indi-
cate a pupil's tendency to attribute conservative ideas to liberal
articles and liberal ideas to conservative articles. Such scores
indicate a kind of gross error in judgment and, if relatively
large, suggest inability of the pupil to comprehend the general
ideas which the authors are trying to sell to the reader.

Suggestibility. Scores in this category indicate the extent to which
the pupil indiscriminately attributes conservative ideas to con-
servative articles and liberal ideas to the liberal articles. (A
score of this kind means that the pupil says that the author
wants him to “accept” an idea which is keyed “insufficient evi-
dence.” The items keyed “insufficient evidence” reflect the
general point of view in the articles.)

²² A more detailed description of how these categories are derived from
the test scores and how they are to be interpreted can be found in the
“Explanation Sheet and Interpretation Guide” for Form 5.31.


Except for the category “general objectivity,” the scores
in Part I categories are separated into “liberal” and “conserv-
ative.” Thus in the “suggestibility” category each pupil has
two scores, one showing his suggestibility in interpreting the
conservative articles and one showing suggestibility toward
the liberal articles.
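
As a rough illustration only, the two Part I scores whose derivation is
stated above (general objectivity as the per cent of correct responses,
and suggestibility as "accept" responses to items keyed "insufficient
evidence," tabulated separately for liberal and conservative articles)
can be sketched in Python. The response letters ("A" is assumed here to
mean "accept"; "B" and "C" follow the directions quoted earlier), the
sample items, and all function names are hypothetical and are not taken
from Form 5.31. Expressing suggestibility as a simple count is likewise
an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    key: str      # keyed response: "A" accept, "B" reject, "C" insufficient evidence
    leaning: str  # leaning of the article the item belongs to: "liberal" or "conservative"


def general_objectivity(items, responses):
    """Per cent of all responses that agree with the key."""
    correct = sum(1 for item, r in zip(items, responses) if r == item.key)
    return 100.0 * correct / len(items)


def suggestibility(items, responses):
    """Count, per article leaning, of 'accept' responses to items keyed 'insufficient evidence'."""
    counts = {"liberal": 0, "conservative": 0}
    for item, r in zip(items, responses):
        if item.key == "C" and r == "A":
            counts[item.leaning] += 1
    return counts


# Hypothetical six-item test and one pupil's responses.
items = [
    Item("A", "conservative"), Item("C", "conservative"), Item("B", "conservative"),
    Item("C", "liberal"), Item("A", "liberal"), Item("C", "liberal"),
]
responses = ["A", "A", "B", "A", "A", "C"]

print(general_objectivity(items, responses))
print(suggestibility(items, responses))
```

Keeping the two suggestibility counts apart, rather than summing them,
mirrors the separation of Part I scores into "liberal" and
"conservative" described above.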

The scores on Part II are tabulated according to the fol-
lowing categories:


Identification of propaganda techniques used in the articles. This
category indicates the degree to which the pupil can recognize
the use of the forms of argument keyed as “propaganda tech-
niques.”

Confusion of propaganda techniques used and not used. This
category shows the extent to which the pupil indicates that
the techniques keyed as “not used” were used in the articles.

Uncritical toward the use of propaganda techniques. The tend-
ency of the pupil to approve the use of propaganda techniques
is indicated under this heading.

Recognition of acceptable nature of certain forms of argument.
Recorded in this category are scores showing whether the pupil
approves of the use of the acceptable forms of argument.

Gullibility. Scores in gullibility show the tendency of the pupil
to indicate that the acceptable forms of argument keyed as
“not used” are used in the articles. Due to the nature of the
test items, gullibility means attributing “fairness,” “impartial-
ity,” “open mindedness” to the authors of the articles.


In constructing Part I of the test, the basic hypothesis was
that pupils whose attitudes toward the five social issues in-
cluded in the test were strongly liberal would tend to be
more “suggestible” toward the conservative articles than to-
ward the liberal articles. This was based on the notion that
the liberal pupil would more willingly exaggerate the ideas
of conservative authors than those of liberal authors. Simi-
larly it was believed that the scores of such pupils in the
other columns of Part I would tend to differ as between the
sub-categories “liberal” and “conservative.” To check this
hypothesis an attitude scale consisting of items used in the
test was given to approximately one hundred pupils. These
same pupils took Form 5.31 and their attitudes were com-
pared with scores in the “suggestibility” category in the test.
This study showed that “liberal” pupils were no more sug- 
gestible toward conservative articles than conservative pu- 
pils, and vice versa. Furthermore, a study of test scores has 
shown that most pupils tend to be equally suggestible toward 
conservative and liberal articles. This same tendency is char- 
acteristic of the other categories in Part I. 

The conclusion justified from these findings is that the test 
does not discriminate sharply between the reactions of lib- 
eral and conservative pupils in their interpretation of the 
purposes of the propaganda articles. Sharper differences are 
discovered when scores on individual articles are compared, 
for example, scores on the liberal and conservative articles 
dealing with the issue of socialized medicine. This procedure
is cumbersome, however, and would be impractical for use 
with large classes. Other hypotheses underlying the test seem 
to be reasonably valid. As one phase of a validity study, 50 
essays by pupils who analyzed a subtle piece of propaganda 
as part of a unit of work on this subject were compared with 
the test results. The studies of validity and reliability are 
not complete, however. The instrument has been described 
because it illustrates an approach to this problem which is 
somewhat unique and which warrants further study. 


CONCLUSION 


The two principal uses for these types of instruments are: 
(1) the diagnosis and description of the strengths and weak- 
nesses of individual students and of groups of students in 
relation to the objectives as they have been operationally 
defined in the tests; (2) the measurement of growth in the 
abilities required for successful achievement. The scores on 
the data sheets will yield significant descriptions of students 
with respect to the objectives. The interpreter must, how-
ever, clearly understand the structure of the test problems
and the relationship of this structure to the problem-solving
process. For certain students the interpreter may desire even
more detailed evidence from the test results than that which 
appears on the data sheet. An examination of the responses 
of a particular student to certain items on a test may yield 
such evidence. More often the suggestions raised by an ex- 
amination of the data sheet will lead the teacher to seek 
evidence from other sources to confirm or deny these sug- 
gestions. For example, a student may reveal a tendency to 
use many reasons on the nature of proof test but fail to 
discriminate between relevant and irrelevant reasons. This 
tendency may or may not be confirmed by the teacher's ex- 
perience with the student in daily classroom activities. 

The uses of these instruments are not fundamentally dif-
ferent from those of many other types of tests. Thus after
studying the test results the teacher may wish to provide
curriculum experiences designed to overcome obvious weak-
nesses of a group as a whole, or of individuals within the
group. This may lead to a special unit of work for the whole
class; special assignments undertaken by a particular student
with the advice of the teacher; special attention by the
teacher to certain details of the written work handed in by
one or more of the students; and the like. In other cases,
growth toward this objective might be one of the desired
outcomes of the work of a class over a longer period of time.
For example, every activity of a class over a period of a year
might be designed to make some contribution to the students’
concept of proof.

In this connection it will be useful to measure the growth
of individuals and of classes toward the objective. Although
the students may remember the general nature of these tests
for several months, they can scarcely be expected to re-
member the answers to specific items on the test. Hence the
practice effect of taking the tests once will probably not be
a serious factor influencing the scores on a second administra-
tion of a test several months later. If such studies of growth
are desired, it is especially important, of course, that the
specific exercises in the tests should not be “taken up in
class.” It is also important to keep in mind the effect of the 
total testing situation upon the test results. This total situa- 
tion involves more than a careful explanation of the test 
directions to students, and the provision of adequate time for 
the completion of the test. In the case of many tests, and 
particularly those which have been described, it involves 
also the “readiness” of the class for the test, their attitude 
toward the test as a diagnostic instrument rather than as a 
marking device, and the like. Ideally, the class should look 
upon these tests as an opportunity to demonstrate their abil- 
ity to do clear thinking rather than as a burden and a threat. 

The chief feature of all of these tests is the extent to which 
they make possible a description of a student's thinking abil- 
ity in terms of at least tentative answers to a series of ques- 
tions which are quite general and comprehensive. Success- 
ful performance depends relatively little, compared with the 
usual achievement test, upon knowledge of particular bodies 
of subject-matter content, and relatively much upon broad 
principles of science and of scientific thinking. The objec- 
tives demanded tests to probe among the higher mental 
processes applied not to materials of the sort commonly used 
in psychological investigations, but rather to those commonly 
found in reading of newspapers and magazines, or elsewhere 
in daily life. This approach is fundamentally different from 
one which seeks to synthesize a description of a student's 
thinking abilities from data on many simpler but more read- 
ily controllable psychological reactions. The experience of 
the Evaluation Staff has been that this endeavor has led to 
increasing complexity in the test instruments in spite of the 
demands of practicality for greater simplicity. This increas- 
ing complexity was tolerated in order to maintain close cor- 
respondence between the stated objectives and the behavior 
demanded of the student, and in the hope that the instru- 
ments of this sort may eventually be simplified. 


Chapter III


EVALUATION OF SOCIAL SENSITIVITY


INTRODUCTION 


Origin and Scope of the Objectives Related to the
Development of Social Sensitivity

In any social situation, an individual is aware of, and re-
sponds to, certain factors and lets others go unnoticed. Thus,
on observing an old man selling apples on the street corner,
one individual may be aware only of the convenience of
having apples easily available to him, or be annoyed at hav-
ing the man clutter up the street corner. The awareness and
attendant feelings in this case are self-centered; there is little
consideration for the apple man. Another person may “see”
primarily an old man trying to make a living. He may in
addition feel sympathy for a man who has to make a living
in such a precarious way, or feel that this way of earning a
living is the man’s just due, determined by his ability. Atten-
tion in this case is centered on the apple man as a human
being. To a third person this experience may suggest the
problem of security in old age. He may wonder why there is
not a more satisfactory provision for old people to make their
living. Awareness and sympathy in this case center not only
on the apple man. He becomes a symbol for a whole group
of people, or for an issue, and sympathy for him is likely to
mean concern for the problem or issue which he embodies.

Depending on the type of response, various impulses to ac-
tion may also suggest themselves. Annoyance with the apple
man may suggest activity leading to his removal. Sympathy
toward him may lead to consideration of ways of helping
him. Concern about injustices in the social order tends to sug-
gest the need for correcting them.

Several different behaviors are involved in these responses. 
Personal sympathies and aversions largely determine the pat- 
tern of initial awareness. The knowledge one possesses, and 
the attitudes and viewpoints one has, determine how one 
interprets the experience. One's ability and inclination to
relate and reorganize ideas gained from previous experiences
and to apply them to the new situation add insight. The
inclination and ability to relate the feelings evoked and posi-
tions taken in specific situations to more general and abstract
ideas add to both the coherence and the depth of one's in- 
sights in a given case. All of these behaviors, although capa- 
ble of analytic distinction, are related to each other in any 
given experience. 

The term “social sensitivity" has been used to refer to this 
complex of responses. In the common usage of the term the 
emotional factors—such as the feelings of sympathy or aver- 
sion, attitudes of approval or disapproval—have been em- 
phasized. However, this term can also be used to connote the 
intellectual responses—such as the range and quality of the
elements perceived in a given experience or the significance 
of the ideas associated with it. 

In the first statements of objectives submitted by the
Schools in the Eight-Year Study the term “social” was used
in connection with many types of behavior somewhat similar
to the ones described above. Frequent among the statements
were terms such as social consciousness, social awareness,
social concern, social attitudes, social integration, sense of
social responsibility, social understanding, social intelligence.
Thus many schools seemed interested in promoting a greater
awareness of social aspects of the immediate scene as well as
of the issues underlying current social problems. At the same
time concern was expressed that unless students achieve
clarification of their personal patterns of social values and
beliefs, intelligent social thinking would remain an elusive
object of educational effort. The apparent blocking of ra-
tional thought by personal prejudices and biases, by a
warped sense of values, or by the tendency to react in terms
of social stereotypes, was recognized, and many statements
of objectives emphasized the importance of a clearer, more
consistent, and more objective pattern of social values and
beliefs. A good deal of attention was also devoted to the
problem of helping students apply the values, loyalties, and
beliefs they developed to an increasing range of life prob-
lems. The term “social sensitivity” was adopted to serve as a
consolidating focus for this apparently heterogeneous yet
highly related complex of objectives.

In order to see more clearly what was implied in these
statements of objectives from the schools, two committees
were established. These committees undertook to make a
coherent analysis of social sensitivity as a total objective and
to clarify and specify some of the more crucial aspects of it
sufficiently to lay a foundation for the development of eval-
uation instruments. Some of the significant aspects of social
sensitivity which were emphasized in the course of the
analysis are described in the following section.


Significant Aspects of Social Sensitivity 
The first exploratory meetings of the committees revealed
a diversity of concepts regarding social sensitivity. In the
course of the discussion sensitivity was defined, by implica-
tion, as awareness, ways of thinking, interest, attitude, and
knowledge. A whole range of problems representing signifi-
cant areas of social sensitivity was also mentioned. These
ranged from such “immediate” social patterns as relations
with other people to such general social issues as unemploy-
ment, effective democracy, and social justice.

To get a clearer and a more concrete picture of the specific
behavior involved in this objective, the committee under-
took to collect anecdotal recordings of behavior incidents
illustrating any aspect of social sensitivity which teachers in
the Thirty Schools thought important. This material was
carefully analyzed and the various types of specific behavior
were listed. Altogether, 74 types of behavior were indicated
or implied by the anecdotes submitted by committee mem-
bers and other teachers. The list below gives a few illustra-
tions:


1. The student frequently expresses concern about so-
cial problems, issues, and events in conversation, free
writing, creative expression, class discussion.

2. The student is fairly well informed on social topics;
he has a reasonable background and perspective, and
would not often be misled by misstatements.

3. When facing a new situation, problem or idea, he
is eager for more information, seeks to identify sig-
nificant factors in the situation, carries thought be-
yond the immediate data.

4. He is critical about expressed attitudes and opinions
and does not accept them unquestioningly; distin-
guishes statements of fact from opinion or rumors,
discerns motives and prejudices.

5. He is able to discern relevant issues and relationships
in problems, ideas, and data. He relates ideas widely
and significantly and discriminates among issues.

6. He judges problems and issues in terms of situations,
issues, purposes, and consequences involved rather
than in terms of fixed, dogmatic precepts, or emo-
tionally wishful thinking.

7. He reads newspapers, magazines, and books on social
topics.

8. He is able to formulate a personal point of view; he
applies it to an increasingly broader range of issues
and problems.

9. He is increasingly consistent in his point of view.

10. He participates effectively in groups concerned with
social action.

A classification of these behaviors resulted in the following
list of major aspects of social sensitivity of concern to teachers
in the Thirty Schools:

1. Social thinking; e.g. the ability (a) to get significant
meaning from social facts, (b) to apply social facts
and generalizations to new problems, (c) to respond
critically and discriminatingly to ideas and argu-
ments. (Statements 4 and 5 above, for example, would
fall into this classification.)

2. Social attitudes, beliefs, and values; e.g. the basic
personal positions, feelings, and concerns toward
social phenomena, institutions, and issues. (State-
ments 8 and 9.)

3. Social awareness; that is, the range and quality of
factors or elements perceived in a situation. (State-
ments 1 and 6.)

4. Social interests as revealed by liking to engage in
socially significant activities. (Statements 3, 7, and
10.)

5. Social information; that is, familiarity with facts and
generalizations relevant to significant social problems.
(Statements 2 and 3.)

6. Skill in social action, involving familiarity with the
techniques of social action as well as ability to use
them. (Statement 10.)

The committee on social sensitivity took the responsibility
of developing instruments for evaluating three of these as-
pects; namely, the ability to apply social generalizations and
facts to social problems, social attitudes, and social aware-
ness. The present chapter is chiefly devoted to a description
of the instruments pertaining to these aspects. Instruments
dealing with other phases of social thinking—such as the 
interpretation of social data, and critical reactions to argu- 
ments and propaganda—have been discussed in the chapter 
on Aspects of Thinking. The appraisal of social interests 
is discussed in the chapter on Interests. No new instruments 
were developed to evaluate the acquisition of social informa- 
tion, primarily because published tests were already avail- 
able and because teachers felt relatively little need of assist- 
ance in this task. As far as securing evidence of skill in social 
action is concerned, observational records seemed to be the 
most effective method. These are discussed briefly in the 
following section. 


INFORMAL METHODS OF GETTING EVIDENCE ON SOCIAL
SENSITIVITY


An objective which involves as diverse types of behavior 
as those described in the preceding section obviously neces- 
sitates the use of several approaches and several techniques 
for its appraisal. These will include paper-and-pencil tests as 
well as observational techniques, each being employed ac- 
cording to its appropriateness to the behavior that is being 
evaluated. Thus the ability to think through social problems 
can be adequately appraised by using paper-and-pencil tests. 
For the evaluation of some other aspects of social sensitivity, 
such as the identification of social beliefs, paper-and-pencil
tests are recommended chiefly because they are economical
and because these behaviors are rather difficult to observe 
directly and objectively. Still other types of behavior, such as 
the disposition to act on one’s beliefs, or the degree of par- 
ticipation in social action and in discussion of social prob- 
lems, require direct observation of overt behavior. Many of 
these observational and informal techniques involve only a 
more effective use of procedures employed and materials
secured in the course of normal teaching procedures.

Anecdotal records are an effective way of securing con-
crete descriptions of significant behavior of individuals or
groups. Since they are a way of recording direct observa- 
tions, anecdotal records are appropriate for securing evi- 
dence on all types of overt behavior. However, since such 
a descriptive record is highly time-consuming, the function 
of anecdotal records in a comprehensive evaluation program 
is usually supplementary: to give vivid, intimate, concrete 
material to help make more meaningful other more sys-
tematic but less colorful types of evidence. The nature and
role of anecdotes and the criteria for selecting and writing
them have been described elsewhere.¹ Here it may suffice to
give a few illustrations of anecdotes pertaining to social
sensitivity.

A disposition on the part of a group to consider the effects
of one's actions upon the welfare of others, and to apply
ethical principles in making decisions, is illustrated by the
following incident:


The school newspaper had been supported by the income from
advertising solicited from small neighborhood stores which the
students did not patronize. A student questioned the ethics of
such a procedure in the student government assembly. Others in
charge of the business management of the paper defended the
method on the grounds that it was a general practice with school
[...]


[...] are capable of using present events to speculate about their
consequences.


In connection with a report of a demonstration by members of 
the League for Industrial Democracy protesting against the “Rex” 
sailing with munitions for abroad, speculation was aroused re- 
garding the consequences of an embargo. How effective would 
government control of the sale of munitions be? What devious 
ways, such as selling to a neutral country, would be devised? 
(This discussion occurred during the Italian conquest of
Ethiopia.)

Personal attitudes toward social issues are often reflected in 
the daily incidents in the school, as in the following: 


Gene came into my room, explaining that she had had an argu-
ment with some members of her group over their attitudes dur-
ing trips they had made to Harlem and the East Side of New
York. Jane had told her that she could not see how anybody could
like slumming. Gene had objected to such an attitude, since the
purpose of the trip was to study the living conditions of people
in an unfortunate situation. To her, she said, those trips, together
with the study of housing and income, had been one of the most
meaningful experiences. She wants to write on that problem.²


² It is possible to interpret the incidents given above in several different
ways. A single incident does not necessarily prove anything about the
behavior of an individual and a number of anecdotes covering a period of
time must be collected before any generalization is attempted.


Students’ writing presents other opportunities for securing
evidence on social sensitivity. Much writing contains some
expression of social attitudes and of social values held by the
author, provided its content is analyzed from that standpoint.
Often only a listing of the topics chosen for creative writing
over a period of time or for free choice “research” reveals
trends in social sensitivity. Thus, frequent choice of social
problems to write about or frequent emphasis on social con-
text and social implications is an indication of real interest
in social matters. Free choice writing, however, provides
only sporadic evidence, and not necessarily on the particular
aspects of behavior a teacher may wish to explore. To secure
more systematic evidence, controlled assignments in which
all students respond to the same general problem, issue, or
experience, are often employed. Below is a sample of written
responses to the following paragraph assigned as a topic to
the whole class: “Nothing can be done about poverty. There
have been and always will be poor people, incapable people,
unambitious people, dirty work to do, survival of the fit-
test.”


Roy: I think something could be done about poverty. They could 
be taught many things they have no chance to learn today. They 
should be housed in a healthy environment. I think there will 
always be poor people, unambitious people, incapable people, 
and dirty work to do, but I do not think that a very great per- 
centage of the poor today are poor because of these reasons. 
They don’t have a chance. I don’t think that 42 per cent of the 
Americans today fall into that lazy and unambitious class, yet 
42 per cent of Americans are poor. There must be something
wrong with our system today.

John: I can find little pity for white and colored trash who have
never amounted to anything. . . . I think that the smarter man
should make more money and that it would wreck any advance-
ment of civilization so to restrict initiative as to pay the man
who carries twice the load as much as the mass below him gets.

Mary: Very few people would at any time . . . be willing to
give their money away. Of course, they can be made to give it
to the government, but it seems to me to be a shame if people
are taxed so heavy to aid all the poor. Surely I agree something
could be done, but I can imagine my own feelings if the majority
of the voters, who are middle class and poor, should vote for a
tax that would take away a large part of the money and savings
I had worked for and made.


Even this limited sampling reveals the possibilities of this
method of learning about the social viewpoints of the stu-
dents. These excerpts reveal an interesting variety of views
regarding causes and cure of poverty and unemployment. 
Different positions are taken toward taxation. Personal sym- 
pathy for people in different economic circumstances or lack 
of it is shown. One can even gain some idea of the nature 
and degree of awareness of social conditions in each student. 

Records of free choice activities of all sorts often yield sur-
prisingly useful information. Thus records of free reading
may give clues regarding students’ social interests, level of
social awareness, and maturity and direction of social out-
look. Records of activities of all sorts, in-school and out-of-
school, such as participation in school government, vacation
activities, attendance at motion pictures, lectures, and con-
certs, and other leisure-time activities are also useful, par-
ticularly when the nature of the activity is recorded in addi-
tion to its frequency.³ Although these records serve primarily
as evidence of interests, analysis of their content also serves
for evidence of social sensitivity.

³ For further discussion of the use of reading records and activities records,
see The Social Studies in General Education, pp. 345-46.

Free response tests employing a form akin to projective
techniques are also useful devices for getting at personal re-
sponses to social issues. Their advantage lies in their indi-
rection. The individual is not asked directly to reveal his
social values. He is provided an apparently innocent object
of attention to which he can respond freely and personally.
The object of attention is so chosen as to draw out revela-
tions of his pattern of social sensitivity. In a completely free
response test, only a brief statement is given, and students
are asked to list all of the thoughts that occur to them in con-
nection with that subject.


Problem. The following quotation from a local produce market
appeared in a daily paper:
“Cooking onions—30 cents per bu.”

Directions: List all of your thoughts about this quotation which
[...]

sonal way interferes with the possibility of assigning his re-
sponses a precise and fully objective meaning. However,
when teachers are able to develop valid exercises of this sort
and take the care and the time necessary for a diagnostic
analysis of the responses, tests of this sort have a real role to 
play, particularly since they can be made more readily an 
integral part of teaching than is the case with more formal 
tests. 


EVALUATION OF THE ABILITY TO APPLY SOCIAL FACTS
AND GENERALIZATIONS 


The teachers in the Thirty Schools were much concerned 
that students develop a willingness and ability to use social 
facts and generalizations, gained through their study, in un- 
derstanding and explaining social phenomena around them. 
They recognized the futility of the mastery of a background 
of facts without growing in ability to apply them to an in-
creasing range of social issues met in daily life. In many
schools a serious attempt was made to give students an op- 
portunity to think through new problems in the light of their 
previous knowledge. For this reason interest was expressed 
in developing some instruments to appraise students’ growth
in ability to apply social facts and generalizations. 


ANALYSIS OF THE OBJECTIVE 


Prior to the development of instruments several explora-
tions seemed necessary. First, it seemed important to iden-
tify the generalizations which were considered fundamental
to the understanding of social problems and which, there-
fore, the students were expected to know and to apply in
their thinking. It seemed also necessary to analyze and de-
scribe the kinds of behavior involved in applying social gen-
eralizations and facts. Finally, some exploration of the areas
of problems and issues which the students may be expected
to be able to think through was also needed. In order to get
some appropriate criteria by which to appraise this aspect
of thinking, it seemed important to identify some of the de-
sirable characteristics or qualities of the process of applying
social generalizations as well as the difficulties encountered
by the students in achieving these qualities. The following
sections will discuss these questions in turn and indicate the
decisions which were made.


Generalizations and the Processes
Involved in Applying Them

Students are often expected to decide whether certain actions,
proposed or accomplished, are justifiable, desirable,
or reasonable. Such decisions as whether an article attacking
democracy submitted to a school paper should be printed, or
whether a certain law should be passed in the legislature, are
examples. Decisions are presumably made more intelligently
when the student understands some of the generalizations
which are applicable and has the pertinent facts available.
Students may also be expected to explain certain events or
to predict the probable consequences. Thus, in predicting
the probable effects of a certain type of sales tax, it is important
to consider both what is known about the effects of
different forms of taxation on various groups in society and
certain general principles of taxation. In determining the
desirability of the measure in a democratic society, the
consideration of certain basic democratic values, such as the
welfare of all groups and individuals, and securing equality
of sacrifice as far as possible, is also necessary. In much the
same way, facts and generalizations are needed in judging
the soundness of conclusions drawn or decisions made by
other people.

[...] effort was made by the [...] teachers in schools
participating in the Study for additions and criticism. Other
sources such as Billings’ list of social science generalizations
and typical textbooks and references were also examined.⁶
The final list was again checked by
teachers to indicate which of the generalizations they con- 
sidered fundamental in understanding social phenomena, 
which of them were emphasized in their teaching, and which 
of them were touched upon but not emphasized. 

The analysis of this list of generalizations raised several 
questions about the nature of social science generalizations. 
In the first place, the line of demarcation between a social 
fact and a social generalization was not clear. Many of the 
generalizations listed as major understandings seemed little 
more than generalized facts and as such had a limited utility 
in explaining social phenomena other than the ones which 
they directly summarized. Thus, the generalization that a 
variety of taxes is levied in the United States adds but little 
to the understanding of the issues of taxation. 

The question of the dependability or the “truth” of many 
of the generalizations was also raised. Many of these gen- 
eralizations seemed to apply only to a limited range of situa- 
tions, and lacked the universality commonly attributed to a 
“principle,” as the term is used in the natural sciences. Often 
these generalizations seemed little more than hypotheses, 
useful in exploring ways of explaining events, but questionable
for exact prediction. Still other generalizations seemed
to have little validity independently of a particular social
philosophy or theory. Some generalizations seemed to be direct
expressions of the social beliefs held by individual
teachers, and the validity of these beliefs was often questioned
by other teachers holding different beliefs. It seemed
clear that the majority of useful and significant social science
generalizations were not verifiable in the same sense as are
the majority of scientific principles.

⁶ Neal Billings, A Determination of Generalizations Basic to the Social
Studies Curriculum (Baltimore, Warwick and York, 1929).

It seemed advisable, therefore, to think of social science 
generalizations primarily as tools for further thinking, for 
formulating tentative explanations, solutions and conclusions, 
rather than as bases for precise predictions, as infallible 
guides for action, or as indisputable expressions of “truth.” 
It was finally agreed that the term “social science generaliza- 
tion or principle” would be used to describe any generaliza- 
tion which could be applied to a range of specific situations 
for the purpose of explanation or prediction, whether or not 
this generalization was applicable over an indefinitely wide 
range of such situations or was universally true, precise, or 
verifiable.⁷

It was clear also that the different types of generalizations
suggested involved differences in the ways in which they
could be used in the thinking process. On the basis of these
differences the principles were classified into three types,
each type perhaps implying a different technique for evaluating
its use. One group included descriptive generalizations,
serving merely to summarize a body of discrete facts.
Thus, a body of facts about income might be summarized by
the generalization, “people earn their incomes through a
diversified range of activities.” Another type of generalization
served to indicate cause-and-effect relationships and to explain
social phenomena. Thus, a body of data relating to
economic penetration into undeveloped countries might be
summarized by some such generalization as “economic penetration
of an undeveloped country frequently results in military
and political domination.” A third type had to do with
expressions of value judgments, opinions, or beliefs. Thus,
the body of facts regarding freedom of speech might be
summarized in the principle, “freedom of speech is essential
to the preservation of democracy.” This sort of statement
expresses a viewpoint or value judgment which is incapable of
verification in the usual sense of the term.

⁷ On the term “principle” in curriculum building, see Hollis L. Caswell
and Doak S. Campbell, Curriculum Development (New York, American
Book Co., 1935), pp. 87-90.

The effect of the compilation of such a sample list of gener- 
alizations upon teaching was also considered. Some teachers 
feared that the list would suggest a minimum set of generali- 
zations to be adopted by all teachers and to be taught for 
memorization. It was agreed the preparation of the list should 
not be taken to imply that these generalizations had been or 
should be taught as statements to be learned, but rather that 
through the best learning procedures the students would be 
brought to understand certain generalizations, and that they 
would be given opportunity to apply some of these in their 
school work. The list was to be used as an illustrative sample 
of generalizations for the sole purpose of exploring the possi- 
bility of evaluating students’ ability to apply them. 

Analysis of Behavior 


In the course of the above discussion some of the behaviors 
involved in applying facts and principles to social problems 
have already been indicated. 

As was described above, application of principles and facts
usually takes place when people are called upon to do any of
the following: (a) explain certain ideas or phenomena, (b)
predict consequences of events, (c) decide on a course of
[...] conclusions, or decisions [...] these situations, provided
[...] necessary to be aware of [...]. A more reasonable judgment
[...] appropriate use is made of whatever
facts and generalizations are pertinent to the problem. In the
process of making judgments of this type the following
behaviors are involved:


1. Relating previously learned facts and generalizations
   to each other and to the given problem.

2. Discriminating between facts and generalizations
   which are relevant to a given problem and those
   which are not.

3. Discerning the logical relationship between a particular
   conclusion, decision, or a course of action and
   a generalization or a fact.

4. Organizing facts and principles learned in different
   contexts in such a way that they can be helpfully used
   in analyzing the problem or in arriving at the conclusion.


One of the important points brought out in analyzing the 
objective was that the most fruitful use of important facts 
and generalizations takes place when these are applied to 
problems new to the students. Although knowing the facts 
and generalizations themselves was regarded as basic to the 
ability to use them, teachers were primarily concerned in this 
connection with having students develop the ability to or- 
ganize the facts and principles and relate them to each other 
in new ways. Hence, the recall of applications made by other
people was not considered a behavior to be diagnosed by the
prospective instruments.


Criteria for Appraising the Process of 
Applying Facts and Generalizations 


An analysis of the specific behaviors in this type of thinking
is helpful, but it is not sufficient for evaluating that behavior.
It is also necessary to indicate certain criteria by
which to appraise that behavior. Therefore, an attempt was
also made to outline the characteristics, both positive and
negative, of the process of applying social science principles,
which it seemed important and useful to diagnose.

The following characteristics were suggested as important
by the committee:

Relevance: Is the student discriminating in his use of
generalizations and facts? Are the generalizations
which he uses relevant to the situation?

Comprehensiveness: To what extent does the student 
see the implications of generalizations and facts? 
What range of important generalizations does he con- 
sider? Has he failed to use some of the important 
generalizations? 

Consistency: In the use of value or attitudinal principles 
does he show consistency in the point of view which 
he accepts? Does he use some principles which are 
conflicting either with each other or with the course 
of action or solution under consideration? 

Objectivity and Tenability: Does the student rely pri- 
marily on generalizations which can be substantiated 
by fact, or does he use slogans, emotional phrases, 
and clichés? Are the statements of facts and generali- 
zations used tenable in the sense that they do not 
contradict commonly known information? 


Selection of Problems 


The kinds of problems in which students may be expected
to apply facts and generalizations which they have learned
were also explored. Again, teachers were asked to submit a
list of problem areas dealt with in their classes. A list of 52
problem areas was thus assembled. A considerable range of
types of problems was suggested. Some teachers emphasized
problems of personal-social relations; others were concerned
with so-called large social issues. The most frequently
mentioned among the latter were: [...] and advertising,
[...] distribution of wealth, civil liberties, theories and
forms of government, international relations, labor, natural
resources, racial issues, profit system, public health, relief,
taxes, housing, war and peace, unemployment, public opinion.


CONSTRUCTION OF THE TEST ON THE ABILITY TO APPLY
SOCIAL VALUES


The explorations described above determined in several 
ways the nature of the instruments that were developed. In 
the first place the analysis of the nature of generalizations 
indicated that there was a sufficient difference between the 
processes involved in the application of social values and 
those involved in the application of non-value generaliza- 
tions and facts to warrant the use of different evaluation 
techniques. Accordingly, two instruments were developed: 
one to deal with the application of value principles or demo- 
cratic tenets, the other to appraise the application of facts 
and explanatory generalizations. The first of these instruments,
Social Problems (Form 1.41 and Form 1.42), was
developed and studied more extensively and is, therefore, 
reported more completely in this chapter. The second, Ap- 
plication of Social Facts and Generalizations (Form 1.5), is 
reported briefly. 

Several suggestions regarding techniques for the construction
of instruments were also derived both from the analysis
of the generalizations and of the behavior processes involved
in their use. Thus, it seemed to be out of the question to
construct exercises requiring students to respond to social
generalizations, particularly to value principles, as true and false,
or right and wrong. It seemed more appropriate to require
students to determine the logical relationships between
conclusions, courses of action, and certain generalizations and
facts. The very nature of the thinking process in this area
indicated that the exercises should take the form of responding
to social values in the context of certain problems and
issues, and not in isolation. Similarly, the criteria for appraising
the process of applying social generalizations, such as
relevance, consistency, comprehensiveness, and pattern of
values, determined, in a general way, the selection of the
issues to be included in the test, and the sampling of the
specific items in the exercises. Thus, in order to appraise
the consistency of value pattern it was necessary to include
conflicting value principles in each of the exercises. Broadly
speaking, then, the categories for the subsequent keying of
the test items were determined by a jury of teachers.

Naturally the analysis of the committee suggested only the 
main structure of the instrument. Additional criteria for the 
choice and formulation of the items in the test as well as for 
the choice of summary categories were developed according 
to what was revealed in the study of the results from the 
tentative forms of the instrument. 


The Choice of the Elements in the Test 


In the main it seemed necessary to provide a testing situation
in which the students would have an opportunity to take
positions or to make decisions about some significant social
issues and to support these decisions by using value principles.
Consequently the following structure for the test was
eventually adopted:

1. A problem situation describing an important issue
   was presented.

2. Three courses of action representing three different
   positions toward the issue were formulated. The students
   were to choose the one or ones which they
   thought most desirable.

3. A list of “reasons” consisting of value principles was
   given from which students could choose the ones they
   would use to support the course of action chosen.
   (See illustrative exercise, pp. 180-182.)


As suggested in the analysis of this objective, certain criteria
were set up for the choice of the content in each of the
three parts of the test mentioned above. Thus, in order to be
sure of providing opportunity for applying value principles
and not just remembering them, the problems were to be
new to the students. Since application to life problems was
of concern to teachers, significant contemporary problems
were chosen whenever possible; actual problems reported in
newspapers or magazines were used. The fact that there are
differences of opinion about the value generalizations suggested
problems of controversial nature permitting several
solutions or conclusions. In order to engage the effort of students,
it seemed necessary to select problems which had
some significance and meaning to them. Therefore, the tentative
formulations of the problems were submitted to students
for their criticism and suggestions.

Since solutions to social problems could not be considered
as “right” or “wrong” in themselves, the courses of action
outlined in the exercises represented different positions and
were not to be marked as “right” or “wrong.” In order to provide
for a diagnosis of different value patterns, it seemed
necessary for the courses of action to incorporate the positions
currently taken toward the issues described in the
problem.

The kind of diagnosis that teachers were interested in
making, expressed as criteria for evaluating this type of thinking,
suggested the main types of reasons to be included.
Thus, in order to discover dominant value patterns, it seemed
obvious that statements of contrasting beliefs and values
were needed. In order to provide opportunities for students
to engage in desirable, as well as undesirable, forms of reasoning,
it seemed necessary to include reasons which logically
supported each course of action, as well as those which
were contradictory, irrelevant, or untenable.


Preliminary Explorations of Test Forms 
In order to be sure that the proposed test, in addition to
incorporating the desired diagnostic features, would be on
a level appropriate to the students who were to take it (that
is, would use terms they could understand and include the
kinds of values they were familiar with, the types of undesirable
reasoning they indulged in, and the kinds of value
conflicts current among them), several tentative drafts of the
test were tried out.

Ten “direct form” exercises were drafted. Each contained a
statement of a problem, and three courses of action. Students
were asked to choose from these alternative courses of action
or conclusions those which they approved and to write out
their own reasons to support their choices.


SAMPLE EXERCISE: 


Cotton Picker. Cotton has been picked by hand, which is a slow
and expensive process. Recently, the Rust brothers invented a
machine to do this work. It would pick in 7½ hours as much cotton
as one hand picker could pick over a whole season of eleven
weeks. The cost of production of cotton could be reduced from
$14.52 to $3.00 per bale. To date, this machine has not been
placed on the market. What should be done with this machine?


Solutions: (Check one or more which you think are desirable.) 


__ A. The machine should be placed on the commercial market
      for immediate manufacture and sale.

__ B. The machine should be made available under some form
      of public control and provisions made for establishing in
      other jobs the cotton pickers who are thrown out of work.

__ C. The machine should not be put to use at the present time.


Directions: Write in the space below the reasons which you
would use to support the solution or solutions you
have checked. Be sure to write all of the reasons you
can think of.

Below is a sample of the reasons used by the students checking
the course of action A:

1. The normal trend of business would reemploy the replaced
   workers gradually.

2. The cotton workers could always go on temporary relief.

3. When a good invention like this has to be withheld from
   the market because of the problem of what to do with
   the unemployed, it is a little doubtful whether our present
   economic system is really serviceable.

4. Society should not be deprived of anything that might
   improve work and the products it uses.

5. Economic statistics prove that there is no such thing as
   technological unemployment.


These student responses were used in several ways in 
drafting the instrument. In the first place, it was possible to 
check the usefulness and the validity of the criteria for sum- 
marizing and evaluating the responses suggested by the com- 
mittee. It was found that most of them—comprehensiveness, 
consistency, relevance, tenability, and patterns of values— 
were useful in classifying and summarizing student responses.
Thus variations were found in the range of implications
seen (comprehensiveness). Often the reasons chosen
by the students were in conflict with the courses of action
they had marked (inconsistencies). Many students used reasons
contrary to facts (untenable) or which did not apply to
the courses of action they had chosen (irrelevant). Different
value patterns were also expressed. These value patterns 
were at first summarized under the following headings: pro- 
tection of human values, consideration of general public 
welfare, democratic tenets, desire for justice, approval of 
change, protection of the economic interests of property 
owners, protection of the interests of privileged groups,
economic individualism, safeguarding of present institutions,
laws, and customs. Because of the limitation in the length of 
the test, later it was necessary to reduce this classification to 
the following one: democratic values, undemocratic values, 
and rationalizations. In the second place, student responses 
also suggested the content for each variety of reasons. Thus, 
the types of untenable and irrelevant reasons to be used, the 
kinds of inconsistencies, and the kinds of democratic and
undemocratic values to be included were largely determined
by analysis of these free responses. Suggestions were also 
found regarding terminology suitable for use in the test state- 
ments. The final form of the test included many statements 
made by the students. In other cases the phrasing as well as 
the content was patterned closely to the statements made by 
the students. 


Description of the Final Test 


A sample of the final test exercise with an example of some 
of the reasons is given below. The key is inscribed on the 
margin. 

PROBLEM IV. “WORKING CONDITIONS” 


Each year many workers have to stop working either temporarily
or permanently because they develop poor lung conditions,
arthritis, rheumatism, or just general ill health. It is known that
such factors as dust, dampness, and unregulated temperature
greatly contribute to these ailments, though it is impossible to
determine in many individual cases to what extent the illness was
caused by these conditions.

Since it would involve costly improvements to eliminate these
conditions, many mines and factories have done little about them
and oppose further regulation. With the exception of a few states
which have adequate health regulations, at present only such
things as hours of work and conditions leading to accidents are
regulated by the government.

What should be done about such problems?

Directions: Choose the most acceptable course (or courses) of
action and fill in the appropriate spaces on the answer sheet
under Problem IV.

Courses of Action:

(Undemocratic) A. It should be left to the individual mine and
                  factory owners to determine what is
                  needed and what they can afford.

(Democratic)   B. Minimum standards for general working
                  conditions, including all factors injurious
                  to health, should be set by the government
                  and all industries should be required to
                  meet these standards.

(Compromise)   C. In industries where such conditions are
                  likely to prevail, improvements should be
                  made on the basis of suggestions from joint
                  committees of workers and employers.


What reasons would you use to support your course (or courses)
of action?

Directions: Choose the reasons which are in harmony with what
you believe and which you would use to support your course (or
courses) of action and fill the spaces on the answer sheet in the
column under the course of action you marked at the top. If you
have chosen more than one course of action, and a reason supports
both, mark it in both columns.


Reasons:

Key                            Reason

Supports A and C                1. It would be unfair to require factories
Inconsistent with B                to introduce costly improvements
Rationalization                    which they feel they cannot afford.

Supports A and C                2. Without regulation, business can be
Inconsistent with B                depended upon to make necessary
Undemocratic Value                 improvements.

Supports C                      6. If workers participate successfully in
Inconsistent with A and B          solving this problem, there is likely to
Democratic Value                   be further cooperation between
                                   employers and employees.

Supports B                      8. Human welfare should be protected
Inconsistent with A                regardless of the cost to industry.
Irrelevant to C
Democratic Value

Supports A and C               10. Since employers have to bear the
Inconsistent with B                expense of making improvements in
Undemocratic Value                 working conditions, they should have
                                   a voice in deciding what changes
                                   should be made.

Untenable                      12. Most industries today provide as
                                   healthy working conditions as they can
                                   afford without undue strain on their
                                   finances.

Supports A                     15. If a worker is willing to accept
Inconsistent with B and C          employment in an industry, he should
Undemocratic Value                 be willing to work under the conditions
                                   prevailing in that industry.

Supports A and C               16. Even though it is important to improve
Inconsistent with B                working conditions, it is undemocratic
Rationalization                    to accomplish this through
                                   dictation by the government.

Untenable                      20. In the past improvements in working
                                   conditions have come only under
                                   government compulsion.


A word of explanation may be necessary regarding the 
method of arriving at the key for this instrument. The anal- 
ysis made by the committee suggested the classification of all 
items, except the specific diagnosis of the value pattern. This 
was developed by an analysis of responses and was checked 
by teachers. The items were keyed by a jury composed of 
members of the Evaluation Staff and some teachers of social 
sciences.

On the assumption that value preferences and logical judg- 
ments both enter into and influence each other in the normal 
life response to controversial social issues, evaluation proc- 
esses should not isolate these behaviors and treat them as if 
they occurred independently of each other. Hence, the test
is not made up of parts corresponding to each of the aspects
of behavior measured by the test.


Only one process of marking the test is employed: the students
mark the reasons which they would use to support the
courses of action they chose. The students use each reason
only once with each course of action. But each reason is
keyed in several different ways. Thus, reason 1 in the above
exercise supports courses of action A and C and is inconsistent
with B. Depending on the course of action with which
it is used, response to this reason is scored under the accurate
reasons contributing to comprehensiveness, or under
inconsistency. In addition, each exercise contains two or three
reasons which are contrary to commonly known facts, i.e.,
are untenable (reasons 12, 20). These reasons are not keyed
to any particular course of action, but are so sampled that
for each position there is one untenable reason which has
some logical relationship to it. They are scored as untenable
no matter with which course of action they are used.
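
The keying rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used by the Evaluation Staff: the class name, the field names, and the function `classify` are hypothetical, and treating a reason that neither supports nor contradicts a course as "irrelevant" is an inference from the sample key (reason 8 is keyed "Irrelevant to C").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyedReason:
    """One test reason with its key. Field names are hypothetical."""
    number: int
    supports: frozenset = frozenset()           # courses the reason logically supports
    inconsistent_with: frozenset = frozenset()  # courses the reason contradicts
    untenable: bool = False                     # contrary to commonly known facts


def classify(reason: KeyedReason, course: str) -> str:
    """Return the scoring category for a reason marked under one course of action."""
    if reason.untenable:
        return "untenable"      # scored the same under every course of action
    if course in reason.supports:
        return "accurate"       # contributes to comprehensiveness
    if course in reason.inconsistent_with:
        return "inconsistent"
    return "irrelevant"         # assumption: keyed neither way for this course


# Keys copied from the Problem IV sample (reasons 1, 8, and 12).
REASON_1 = KeyedReason(1, supports=frozenset("AC"), inconsistent_with=frozenset("B"))
REASON_8 = KeyedReason(8, supports=frozenset("B"), inconsistent_with=frozenset("A"))
REASON_12 = KeyedReason(12, untenable=True)

for course in "ABC":
    print(course,
          classify(REASON_1, course),
          classify(REASON_8, course),
          classify(REASON_12, course))
```

The same mark on the answer sheet thus lands in a different score column depending on the course of action under which it was made, which is why a single marking process can yield several scores.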

Most of the reasons are also keyed to represent value positions,
as are all courses of action. The value patterns are
grouped into three categories: (1) democratic,⁸ representing
defense of the interests of the general public or general welfare,
of such democratic rights as freedom of speech, equality
of opportunity, and a decent standard of living, of rights of
minorities and other underprivileged groups (course of action
B, reasons 2, 6, 8); (2) undemocratic, representing protection
of special privilege, supremacy of efficiency and economic
gain over human needs and values, undemocratic
procedures, or discrimination (course of action A, reasons 10,
15); (3) compromise, representing essentially an effort to
reconcile these two types of values (only courses of action,
e.g. C, are used). Rationalizations (reasons 1, 16), representing
undemocratic values stated as democratic slogans,
are keyed and scored as a separate value category, but they
can be used to support either the undemocratic or compromise
courses of action. At least six supporting reasons are
available for each course of action. The logically sound support
for the democratic course of action is composed exclusively
of democratic values. Those supporting the undemocratic
course of action are all undemocratic values. About
half of the supporting reasons for the compromise course of
action are democratic and half are undemocratic values. No
matter with which course of action the reason is used, it is
keyed to the same value. Thus reason 1 is keyed as a democratic
value and reason 2 as an undemocratic value, independently
of the course of action with which they are used.

⁸ The meaning of the terms “democratic” and “undemocratic” as used in
this test have thus a special definition, somewhat more encompassing than
the common usage of these terms.

In the entire test there are eight of these exercises, cover- 
ing such problems as conservation of national resources, free 
speech, unemployment, protection of health, distribution of 
wealth, collective bargaining, and socialized medicine. The
pattern of reasons described above is the same in all eight
exercises.


Summarizing and Interpreting the Results 

On the sample form of a data sheet shown on page 185 the 
scores for four students are presented for purposes of illustra- 
tion. At the bottom of the data sheet the maximum possible 
score, the highest score, lowest score, and the group median
are recorded for each column. All of these are computed for 
the class of 53 students from which these four were drawn. 

Scores on this test can be interpreted in terms of answers 
to three questions. The first of these questions is: How
broadly does the pupil relate principles or value generalizations
to chosen courses of action?


Comprehensiveness (columns 1, 2, 3, 4). The most important
score here is found in the column headed Ratio (column [...]
number of logically accurate [...] each course of action [...]
D has used on the average only 2.3 reasons for each course
of action that he chose, a ratio score which is considerably
below the median. This suggests that Student A has a much
broader vision of the implications of social values than does
Student D. The scores on total reasons (column 2) and accurate
reasons (column 3) are for purposes of reference only.
Thus occasionally it is important to see whether a student
has marked many reasons in excess of those needed to support
his position. This would suggest that the student is confused
or lacks discrimination, which, for instance, is the case
with Students B and C. Each has used over 20 reasons which
do not support the courses of action he chose. In the case of
Student B, these constitute over half of the total reasons
marked.
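
As a rough illustration of the arithmetic behind these scores, the sketch below assumes that the ratio score is the number of logically accurate reasons divided by the number of courses of action chosen, which is how the passage describes Student D's 2.3. The function names and the counts in the example are hypothetical; they are not taken from the data sheet.

```python
def comprehensiveness_ratio(accurate_reasons: int, courses_chosen: int) -> float:
    """Average number of logically accurate reasons per chosen course of action."""
    if courses_chosen <= 0:
        raise ValueError("at least one course of action must be chosen")
    return accurate_reasons / courses_chosen


def excess_reasons(total_reasons: int, accurate_reasons: int) -> int:
    """Reasons marked that do not support the courses of action chosen."""
    return total_reasons - accurate_reasons


# Hypothetical counts: 23 accurate reasons spread over 10 chosen courses.
print(round(comprehensiveness_ratio(23, 10), 1))

# Hypothetical counts: 45 reasons marked, of which 22 were accurate.
print(excess_reasons(45, 22))
```

A low ratio with few excess reasons points to a narrow but disciplined use of values, while a large excess suggests the confusion or lack of discrimination noted for Students B and C.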

The second question is: To what extent does the pupil 
show lack of logical discrimination in the use of reasons to 
support the courses of action which he chooses? 

Undesirable Reasons (columns 5, 6, 7). 

Per Cent Inconsistency (column 5). This score gives the 
per cent of the total number of reasons checked by the stu- 
dent which are inconsistent with the course of action chosen. 
A high score here indicates inability to see clearly the logical 
relations between value principles and social issues. As such, 
it is an index either of lack of ability to deal with abstract 
principles or else of a confused value pattern which makes 
it impossible to see their implications clearly. Student D has 
avoided all inconsistencies, while 28 per cent of the reasons 
marked by Student B were inconsistent with the courses of 
action chosen, the median for the class for inconsistency 
being 5. 
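
As a minimal sketch of the arithmetic behind this score, the per cent inconsistency can be computed as shown here. The function name and the counts used in the example are hypothetical and are not taken from the data sheet.

```python
def per_cent_inconsistency(inconsistent: int, total_checked: int) -> float:
    """Per cent of all checked reasons that are inconsistent
    with the courses of action the student chose."""
    if total_checked <= 0:
        raise ValueError("the student must have checked at least one reason")
    if not 0 <= inconsistent <= total_checked:
        raise ValueError("inconsistent count must lie between 0 and total_checked")
    return 100.0 * inconsistent / total_checked


# Hypothetical student: 14 of 50 checked reasons are inconsistent.
print(per_cent_inconsistency(14, 50))  # 28.0
```

A student who checks no inconsistent reasons receives a score of zero, whatever the total number of reasons checked.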

Untenable Reasons (column 6). This score gives the num- 
ber of reasons checked by the student which are contrary to 
commonly known facts. A high score here indicates either a 
tendency to use questionable evidence to support one’s position 
or an idealistic naiveté and goodwill toward social conditions 
and a lack of awareness of the real conditions. Student C uses 
eight such reasons, while Student D 
uses only two. It must be observed, however, that the range 
for this score is small. 

Irrelevant Reasons (column 7). This score gives the num- 
ber of reasons checked by the student which do not apply to 
the particular course of action chosen. A high score here sug- 
gests lack of discrimination between reasons that are relevant 
and those which do not apply to a given course of action. 
Students A and C show a higher than average tendency to fail 
to discriminate between the relevant and irrelevant reasons, 
while Student D has marked only one irrelevant reason. 

The third of these questions is: What values are dominant 
among the courses of action and reasons chosen by the stu- 
dent? 

While the choices of courses of action as well as of reasons 
yield information on patterns of value, the former are used 
only in a subsidiary fashion to determine the consistency of 
the pattern. The scores on reasons (columns 11 to 14) are of 
primary importance here. Those on courses of action can be 
used only as supplementary evidence. The main score on 
dominant values is the per cent democratic values (column 
14). A high score here indicates a clear-cut and exclusive 
acceptance of the democratic values as defined above (p. 
183). One hundred per cent of the values used by Student D 
are democratic, while only 22 per cent of the value reasons 
used by Student B fall in this category. 

Columns 11, 12, and 13 represent a more specific analysis 
of the distribution of value scores. 

Democratic. Scores in column 11 report the number of 
times the student has used reasons expressing the values of 
general welfare and democratic rights. Student A uses a large 
number (43) of values of this type, while Student B has a 
score of 6, which is at the bottom of the distribution for the class. 

Undemocratic. Scores in column 12 give the number of 
reasons which express defense of the interests of special 
privilege of all sorts. A high score here indicates a predom- 
inant acceptance of undemocratic viewpoints on social issues, 
as defined above (p. 183). Student B has used 13 value 
statements of this type. This is not only considerably above the 
median; this type of value also composes the largest part 
of the total value reasons used by him. 

Rationalization (column 13). Included under this 
heading are reasons which attempt to rationalize an essentially 
undemocratic viewpoint by couching it in democratic 
terminology. High scores here indicate a tendency toward 
gullibility to slogans and an inclination to pay lip service to 
democratic generalities. Student C shows such an inclination, 
having used more than the average of these reasons. 
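
The relation between the counts in columns 11 to 13 and the percentage in column 14 can be sketched as follows. The sketch assumes that per cent democratic values is the democratic count divided by the sum of the three value counts. The function name is hypothetical, and the rationalization count of 8 in the example is an assumed figure, chosen only because it reproduces a score of about 22 per cent when combined with the 6 democratic and 13 undemocratic reasons reported for Student B.

```python
def per_cent_democratic(democratic: int, undemocratic: int,
                        rationalization: int) -> float:
    """Share of all value reasons that express democratic values."""
    total = democratic + undemocratic + rationalization
    if total == 0:
        raise ValueError("no value reasons were checked")
    return 100.0 * democratic / total


# 6 and 13 come from the text; 8 is an assumed count for illustration.
score = per_cent_democratic(democratic=6, undemocratic=13, rationalization=8)
print(round(score))  # 22
```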

Sometimes it is worth while to compare the values ex- 
pressed in choices of courses of action with the value pattern 
revealed in reasons. Often these two aspects of reasoning are 
not consistent with each other. Thus if the majority of the 
reasons checked by the student are democratic values but 
several undemocratic or compromise courses of action are 
chosen at the same time, one may infer that the student does 
not fully see the implications of the values he accepts ver- 
bally. Such seems to be the case with Student D. He has 
chosen two compromise courses of action, which normally 
call for part democratic, part undemocratic support, yet he 
has used no undemocratic values among his supporting 
reasons. 

In the foregoing explanation each of the scores was 
considered independently. This is normally the first step in 
interpretation. Since each of the single scores describes only 
one part of a pattern, it is not justifiable to draw conclusions 
about an individual without considering the whole pattern 
of scores. In such a pattern, a score often assumes a meaning 
which differs from the one gained from considering it by 
itself. In attempting a pattern interpretation it is useful to 
consider these scores in two groups: one representing the 
logical aspects (comprehensiveness, consistency, tenability, 
relevance; columns 1 to 7), the other representing the pattern 
of values (democratic, undemocratic, rationalization; 
columns 8 to 14). However, in addition to examining the 
scores in each major group in relation to each other, it is also 
necessary to consider the logical aspects in the light of the 
value pattern and vice versa. 

Student A, for instance, tends to be comprehensive in his 
use of reasons. At the same time he is somewhat lacking in 
logical discrimination, as shown by his tendency to accept 
inconsistent and irrelevant statements in supporting the 
courses of action. Since his dominant value pattern is 
democratic in a clear-cut way, one is led to infer that his main 
difficulty is weakness in logical discrimination. 

Student C shows confusion both in the logical aspects 
(relatively high inconsistency) and in his value pattern as 
shown by his frequent choice of compromise courses of action 
and of rationalizations. One might infer from this that 
his difficulties with logical aspects of applying values stem 
from the confusion of the values he accepts. His scores on 
democratic and undemocratic values are rather evenly divided 
and a high score on rationalizations suggests gullibility 
to democratic slogans. When one considers in addition the 
fact that he uses only a few supporting reasons, one is forced 
to describe the whole picture as that of a lack of awareness 
of, and confusion about, social issues. 

A high degree of inconsistency is one of the major facts 
about Student B. But because his value pattern tends rather 
clearly toward the undemocratic one, one is forced to conclude 
that his main difficulty is that of misapprehension of 
logical relationships between reasons and courses of action. 

The patterns of reasoning illustrated above occur 
recurrently among students. Some students may be broad in 
their reasoning and at the same time consistent, 
discriminating, and have a clear value pattern. Others may be broad, 
but inconsistent and ambivalent in value pattern. Some are 
narrow, clear, and have a democratic value pattern. Others 
may be ambivalent in their values, but not inconsistent. This 
usually happens when they take different positions regarding 
the different issues included in the test but are not confused 
as far as the same issue is concerned. For teachers interested 
in diagnosis of the kinds of thinking students do and of the 
ways their value patterns either help or hinder that thinking, 
this is useful information. 


VALIDITY AND RELIABILITY 


The usefulness of this instrument, as of any instrument, is 
determined by (1) how adequately it measures what it sets 
out to measure (validity) and (2) how reliable a particular 
set of the students’ responses is likely to be. The problem 
of validity is a complex one and includes the consideration 
of the validity of the instrument itself, as well as of the con- 
ditions under which the test is given and taken. In this sec- 
tion attention is devoted to the discussion of the validity of 
the instrument itself. The conditions under which valid re- 
sults are possible in a given situation will be discussed in the 
section on uses. 

The validity of the results from a test of this type is deter- 
mined by several factors. In the first place, there may be a 
difference between the behavior specified in analysis and the 
behavior actually measured by the test. Any test situation is 
an artificial situation and may introduce difficulties irrelevant 
to its purpose. Hence, it is important to see what correspond- 
ence there is between the evidence from the test and that 
obtained from freer and more natural situations. 

Each test also employs a certain method of scoring and 
summarizing. This method may not give the most adequate 
picture of the responses to the test and therefore it is 
necessary to determine how effective the method of scoring and 
summarizing is. 

Finally, there is always the question of the degree to which 
general ability affects success with a given test. This test does 
not purport to be a measure of general intelligence. There- 
fore, some evidence is needed to determine the relation of 
this factor to the responses to this instrument. 

Some evidence was secured on all of these points in the 
course of the study. Serious effort was made in the process of 
constructing the test to assure as great a degree of validity as 
possible. Throughout the process of construction steps were 
taken to make sure that the test appraised the behaviors it 
was intended to appraise. As was indicated in the description 
of the preliminary analysis and of the exploratory studies, 
care was taken to see to it that the behavior measured as 
well as the content of the exercises was appropriate to the 
students who were to be tested and consistent with the 
objectives and curriculum emphasis of the schools. The problems 
and generalizations included in the test were chosen 
according to what was found to be most widely emphasized 
in the schools intending to use the test. Student responses to 
essay forms were examined to secure reasons representing 
the types of values and patterns of reasoning current among 
the students. In addition, tentative drafts of the more 
objective forms were tried out and revisions were made on the 
basis of [...] 

[...] were conducted to develop the most 
useful categories of summary and methods of scoring. The 
initial choice of the summary categories was made according 
to the suggestions made by the [...]. These were tried 
out experimentally, and revisions and changes were made 
according to the dependability and usefulness of the 
particular scores as shown by experimental use of the instrument. 
For instance, some of the rather fine classifications of values 
attempted at first proved impracticable because the test 
could not be made long enough to get high reliability on 
these scores. 

The validity of the diagnostic descriptions of students 
made from the test scores was also checked informally 
throughout the Study. In each school where the test was 
given conferences were held with the faculty. Students se- 
lected by the faculty were described on the basis of the test 
scores and these descriptions were submitted to the collec- 
tive judgment of the faculty. Usually students who were 
known by most teachers, and intimately known by some, 
were chosen for this purpose. This was done in about 25 of 
the Thirty Schools, and descriptions of several hundred stu- 
dents were thus examined and checked in the course of two 
years. Outright disagreements on major points were rare. 
These occurred mostly in cases where the observations of 
different teachers varied considerably. 

Certain difficulties were experienced in the use of the usual 
statistical techniques for estimating validity and reliability. 
The scores describing the logical aspects and those describing 
the value judgments are both derived from a single process 
of marking by the student. Each aspect influences the other, 
however, and interpretation must account for this interrela- 
tionship. Thus a high score on comprehensiveness combined 
with high consistency means one thing. The same score on 
comprehensiveness combined with high inconsistency means 
something different. 

However, statistical techniques which are simple enough 
for practical purposes in an exploratory study such as this 
one do not permit the treatment of the validity and relia- 
bility data in terms of a pattern of scores. They usually are 
predicated on the assumption that each score is a separate 
entity. Hence it is felt that the quantitative evidence pre- 
sented in substantiating the claims for a certain degree of 
validity and reliability of the instrument does not do full justice 
to it. 


Validity was investigated by the following three methods: 
(1) comparison of teacher observations with test scores, (2) 
comparison of interviews with students with the test 
materials, (3) correlation of the scores on this test with scores on 
psychological tests. 

The comparison of teacher observations with the test re- 
sults was employed with the full recognition of the fact that 
the opportunities for teachers to observe these particular 
characteristics were apt to be deficient and hence not fully 
reliable. In three schools a selected group of teachers was 
asked to rate a group of senior students separately on the 
three major characteristics diagnosed in the test: 
comprehensiveness in seeing implications of social values, consistency 
of their social reasoning, and the pattern of social values. 
Altogether, 132 students from three schools were thus rated. 
From five to eight teachers in each school participated, with 
an average of four teachers rating each student. A three- 
point scale (1—high, 2—average, 3—low) was used for each 
of the characteristics. These ratings were then compared with 
the corresponding test ratings. The results are presented in 
the table below. 

MEAN SQUARE CONTINGENCY CORRELATIONS 
OF TEACHER RATINGS AND TEST RATINGS 

                           Compre-       Consistency    Values 
                           hensiveness 
School I                     .49            .63           .38 
School II                    .50           [...]         [...] 
School III                   .78            .29           .88 
One teacher in School I     [...]          [...]         [...] 


These data suggest that, on the whole, there is a general 
agreement between teachers' ratings and test ratings. All 
correlations are positive and with three exceptions are .50 or 
higher. The highest relationships were found in School I, in 
which the teachers participating in the rating had the best 
opportunities to observe their students. The ratings of the 
student adviser in the same school have the highest corre- 
spondence with test scores. Thus the relationship between 
the test and the teacher ratings seems to increase as the con- 
ditions necessary for reliable teacher rating improve. This 
would suggest that the reliability of teacher ratings is a 
strong factor in limiting the correspondence. It should also 
be remembered that while in the normal process of inter- 
preting the results of this test the meaning of a single score 
is often altered in the light of the whole pattern of scores, 
single scores were used in the statistical processes of com- 
puting the correlations. Hence, the coefficients expressing 
the correspondence are apt to be lower than would have 
been the case had it been possible to use all scores in 
relationship to each other. However, in spite of these difficulties, 
these data suggest that when thoughtful judgments are made 
by teachers who have had adequate opportunity to observe 
students' social thinking, a rather close agreement is likely to 
occur. These data are also in accord with the hypothesis that 
under usual classroom conditions teachers would be able to 
identify most of the extreme cases without the test, but that 
close agreement throughout between the test and teacher 
rating would not be found, since teachers ordinarily do not 
have a very adequate basis for observing these particular 
qualities and hence for rating them very precisely. 
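
For readers who wish to see how a contingency correlation of this kind is obtained, the sketch below computes Pearson's coefficient of mean square contingency, C = sqrt(chi-square / (N + chi-square)), from a cross-tabulation of teacher ratings against test ratings. The text does not state the exact formula used or whether any correction for the number of categories was applied, so the choice of formula, the function name, and the illustrative tables are all assumptions.

```python
import math


def contingency_coefficient(table):
    """Pearson's C for a cross-tabulation given as a list of rows.

    table[i][j] is the number of students placed in teacher-rating
    category i and test-rating category j.
    """
    n = sum(sum(row) for row in table)
    if n == 0:
        raise ValueError("the table contains no observations")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows or columns
                chi_square += (observed - expected) ** 2 / expected
    return math.sqrt(chi_square / (n + chi_square))


# Hypothetical tables, not data from the study.
print(contingency_coefficient([[10, 10], [10, 10]]))  # 0.0 (no relationship)
print(contingency_coefficient([[10, 0], [0, 10]]))    # perfect 2 x 2 agreement
```

Without a correction, this coefficient cannot reach 1.00 even when agreement is perfect, and its ceiling depends on the number of rating categories.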

Another method used was that of interviewing the stu- 
dents. Forty-five students, 15 from each of three schools, 
were interviewed. Their specific responses to the test items 
were first analyzed and summarized in a written statement. 
The students were then interviewed regarding their view- 
points on social issues included in the test. Through a series 
of questions, the students were led to comment on the kinds 
of solutions they approved and the reasons why they thought 
these solutions were appropriate. Verbatim records of these 
interviews were taken. The itemized analysis of the test 
responses and interview records were then submitted to four 
judges, all of whom were familiar with what the test was 
attempting to measure. These judges were first asked to 
indicate the extent of agreement between what the students said 
in the interview and how they had marked each exercise in 
the test. This agreement was rated on a three-point scale: 
1, good; 2, fair; 3, poor. An average rating for the degree of 
agreement for each student throughout the test was 
compounded by adding the values of all judges' ratings on all 
exercises and by dividing this total by the number of ratings. 
In most cases the agreement was found to be high. Thus, 
the mean rating on all students on all problems was 1.29, 
indicating only slightly less than “good” correspondence in 
the majority of cases. The lowest average rating on any 
student was slightly better than “fair” (1.78). The number of 
“good” ratings represented 75 per cent of the total number 
of ratings, while the number of cases of poor correspondence 
represented 3 per cent of the total ratings. Thus it is apparent 
that these judges considered the interview materials to be 
highly consistent with the test [...]. This is [...] 
gratifying in view of the fact that [...] students [...] 
a change of viewpoint between the taking of the test and 
the interview. 
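
The averaging procedure described above, in which all judges' ratings on all exercises are added and the total is divided by the number of ratings, can be sketched as follows. The function names and the sample ratings are hypothetical; only the scale (1 good, 2 fair, 3 poor) comes from the text.

```python
def mean_agreement(ratings):
    """Average of all judges' ratings on all exercises for one student."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3) for r in ratings):
        raise ValueError("ratings must be 1 (good), 2 (fair), or 3 (poor)")
    return sum(ratings) / len(ratings)


def per_cent_with_rating(ratings, value):
    """Per cent of the ratings that equal the given scale value."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("at least one rating is required")
    return 100.0 * sum(1 for r in ratings if r == value) / len(ratings)


# Hypothetical ratings pooled over judges and exercises for one student.
pooled = [1, 1, 1, 2, 1, 1, 3, 1]
print(mean_agreement(pooled))           # 1.375
print(per_cent_with_rating(pooled, 1))  # 75.0
```

Because 1 stands for "good," a lower average indicates closer agreement between interview and test.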
Three of the judges were then asked to consider the 
interview materials alone and to rate each student on the 
aspects measured in the test: comprehensiveness, consistency, 
and pattern of values, on a three-point scale (high, average, 
low), in order to get some evidence of the adequacy of the 
summarization and scoring. These ratings were correlated 
with the test ratings on the corresponding scores, with the 
following results (expressed as product-moment 
correlations): comprehensiveness .59, consistency .51, democratic 
value .66. Considering the meagerness of the interview 
materials for rating purposes and the fact that the interviews 
were [...] topics similar to but not the same as the 
test exercises, and taking account of the difficulty involved 
in treating the test scores in isolation from each other, it is 
justifiable to assume that the method of scoring and 
summarizing represents student responses to the test fairly 
adequately. 
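
A product-moment correlation of the kind reported above can be computed as in the following sketch. The function name and the paired values are hypothetical; the study's own ratings are not reproduced in the text.

```python
import math


def product_moment(xs, ys):
    """Pearson product-moment correlation between two paired lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least two pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical interview ratings and test ratings for five students
# (1 high, 2 average, 3 low).
interview = [1, 2, 2, 3, 1]
test = [1, 2, 3, 3, 2]
print(round(product_moment(interview, test), 2))
```

With only three scale points on each side, many pairs are tied, which is one reason such coefficients tend to understate the agreement a reader would see in the full pattern of scores.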

In order to see to what degree general intelligence is re- 
lated to the results on this test, the scores on the American 
Council Psychological Examination for 45 students were 
correlated with the three main scores on this test. The rela- 
tionship was found to be low on all three; namely, compre- 
hensiveness .27, consistency .35, democratic values .04. The 
number of students is too small to afford conclusive evidence, 
but there is a fair indication that the performance on this test 
is relatively independent of the abilities measured by the 
psychological examination. 

Several checks were also made of various aspects of relia- 
bility. The stability of scores was tested by several methods 
of estimating reliability. The split-half method was used on 
scores which permitted such treatment. The Kuder-Richard- 
son formula was used wherever the split-half method did not 
apply.9 The estimated reliability for the score on per cent 
democratic values was obtained by correlating Forms 1.41 
and 1.42 of the test. The coefficients of correlation secured 
from a sample of 600 students in tenth, eleventh, and twelfth 
grades range from .50 (untenable) to .91 (democratic 
values). 
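
The text names the split-half method and the Kuder-Richardson formula without giving details. The sketch below assumes the usual Spearman-Brown step-up applied to a correlation between two half-tests, and Kuder-Richardson formula 20 for items scored 0 or 1. Both choices, the function names, and the sample data are assumptions made for illustration.

```python
def spearman_brown(half_test_r: float) -> float:
    """Estimate full-length reliability from the correlation of two halves."""
    if half_test_r <= -1.0:
        raise ValueError("half-test correlation must be greater than -1")
    return 2.0 * half_test_r / (1.0 + half_test_r)


def kr20(item_scores):
    """Kuder-Richardson formula 20.

    item_scores is a list with one entry per student; each entry is a
    list of 0/1 item scores of equal length.
    """
    n = len(item_scores)
    k = len(item_scores[0]) if n else 0
    if n < 2 or k < 2:
        raise ValueError("need at least two students and two items")
    totals = [sum(student) for student in item_scores]
    mean_total = sum(totals) / n
    variance = sum((t - mean_total) ** 2 for t in totals) / n
    if variance == 0:
        raise ValueError("total scores do not vary")
    # Sum over items of p * q, where p is the proportion scoring 1.
    pq_sum = 0.0
    for j in range(k):
        p = sum(student[j] for student in item_scores) / n
        pq_sum += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - pq_sum / variance)


# Hypothetical data: five students, four items.
data = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(round(kr20(data), 2))              # 0.8
print(round(spearman_brown(0.5), 3))     # 0.667
```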

On the chief scores used in interpreting the results 
(comprehensiveness ratio, per cent inconsistency, number 
democratic values, number undemocratic values, per cent 
democratic values), the reliabilities range from .70 to .91, which 
may be considered fairly high for a test of this type, 
particularly since the final judgment of the students’ behavior is 
based on a pattern of scores and does not depend exclusively 
on any one single score. Low reliabilities were found on the 
scores on untenable reasons (.50) and rationalizations (.67). 

9 Loc. cit. 

10 See Appendix for a complete table of reliability coefficients by grades. 

These data seem to indicate that this test has sufficient 
validity and reliability to be a useful instrument for 
diagnosis. It must be remembered that the behavior measured in 
this test is highly complex, affected by variability in the 
interpretation of test statements and by emotionalized 
responses. Hence, objective tests in this area probably cannot 
be judged by the same criteria as are applied, for instance, 
to tests measuring achievement in acquiring information. It 
is also likely that under optimum conditions, where teachers 
have worked seriously on this objective, and students are 
familiar with the type of reasoning and the kind of content 
involved, both the reliability and validity estimates might be 
higher. 


APPLYING SOCIAL FACTS AND GENERALIZATIONS TO SOCIAL 
PROBLEMS (FORM 1.5) 


As was pointed out above, teachers of the social studies 
were concerned with students' ability to apply not only value 
judgments but also relevant and accurate information in their 
analysis of social problems. An instrument developed to get 
evidence of the latter ability will be described briefly, since 
the processes involved in its construction were analogous to 
those reported at length in the preceding section. 


Analysis of the Objective 

The analysis of the objective resulted in the following list 
of important types of behavior to be evaluated: (1) The 
ability to see the logical relations between general principles 
and specific information on the one hand and the issues 
involved in a given social problem on the other; i.e., to see 
whether a statement supports, contradicts, or is irrelevant to 
a conclusion. (2) The ability to evaluate arguments 
presented in discussing a specific social problem, and in 
particular, to discriminate between statements of verifiable fact, 
statements of opinion and common misconceptions. (3) The 
ability to judge the consistency of social policies with social 
goals; i.e., to judge the appropriateness of certain social 
policies for achieving certain social aims. 

There are two major types of situations in which individ- 
uals make use of these abilities: (1) when one evaluates a 
proposed solution of any social problem, and (2) when one 
proposes a solution and tries to support it. The test described 
below is based upon the first type. These situations occur in 
the consideration of a wide variety of problems, involving 
many types of generalizations and of factual information. 
Before any instruments could be developed in this field, it 
was necessary to make a choice of problem areas and types 
of generalizations to be sampled. The list of social science 
generalizations and of significant problem areas submitted 
by the teachers and discussed above was used as the primary 
source of issues upon which to build the test. These were 
checked further with respect to the frequency with which 
they occurred in high school courses on social problems. The 
following problem areas were selected: consumer buying, 
health, unemployment, housing, soil conservation, civil 
liberties, international relations, taxation, and civil service. 

See p. 170. 


Description of the Instrument 


Exercises were constructed for each of the problem areas 
listed above. Each exercise is a complete test in itself and can 
be used independently of the others. An exercise is composed 
of several parts, constructed in such a way as to give 
evidence of the three abilities listed in the analysis of the 
objective. In the first part of the exercise a social problem is 
described, and one of the frequently suggested solutions is 
indicated. Various statements (some supporting, some 
contradicting, and some irrelevant) concerning the solution are 
presented. The student is asked to indicate whether each 
statement supports, contradicts, or is irrelevant to the 
suggested solution. A student's reactions to this part of the test 
are summarized in terms of the number of accurate responses 
he makes, the number of times he confuses supporting and 
contradictory statements, and the number of times he fails to 
see the relevance of a statement to the conclusion. The 
statements include basic assumptions, general principles, accurate 
information, and common misconceptions. In the second part 
of the test the student is asked to indicate whether each of 
the statements can be proved to be either true or false. The 
student’s reactions to this section are summarized in terms 
of the number of times he discriminates between statements 
of fact and assumptions, the number of times he marks value 
judgments as verifiable, the number of times he marks 
statements of fact as not verifiable, and the number of times he 
discriminates accurately between true statements and common 
misconceptions. An excerpt from one exercise is given 
below. The key is indicated at the left of each statement. 

HOUSING 
Form I 
Application of Principles 1.5 (Tentative Draft) 

Problem: 
Housing is one of the problems of concern today. Many 
schemes have been suggested as a means of improving housing 
conditions. In general, there are two major ways in which 
government can aid in solving this problem: (1) by setting standards 
for and regulating the construction of private housing, and (2) 
by building houses at public expense, contributing either part or 
all of the funds necessary. Each method has certain advantages 
and disadvantages. Nevertheless, many people believe that the 
government should build houses at public expense to rent to those 
sections of the population with the lowest incomes. 

In all cases where the phrase “decent house” or its equivalent is used, 
it is to be defined as a separate house or apartment for each family with 
running water, inside bath, fire protection, and enough room for privacy. 

I. Directions: For each of the following statements, place a check 
mark (√) in one of the columns labeled Part I. Place the 
check mark (√) opposite the number which corresponds to 
the number of the statement in: 


Column A if the statement may logically be used to support 
the underlined conclusion. 


Column B if the statement may logically be used to contradict 
the underlined conclusion. 


Column C if the statement neither supports nor contradicts 
the underlined conclusion. 


Check each item in only one column. In case of doubt, give the 
answer which seems most nearly right. 


In this part of the exercise, assume that each statement is true. 


Supports         1. Whenever houses are not available to the 
Assumption          public, society should assume the 
                    responsibility for making it possible for 
                    everyone to have a decent place to live. 

Contradicts      3. Government-built houses are more expensive 
Misconception       to construct than comparable houses built by 
                    private companies. 

Supports        11. It has been demonstrated that the federal 
Misconception       government can build adequate houses for 
                    the lowest income group cheaply enough so 
                    that they can be paid for out of income from 
                    rent. 

Contradicts     14. Individuals who have heavy investments in 
Accurate            slum property would probably suffer heavy 
                    losses if a broad program of federal housing 
                    went into effect. 

Contradicts     17. The system of private initiative in business 
Assumption          should not be jeopardized by the socialization 
                    of any of the fundamental industries. 

Supports        20. Under present conditions, at least 50 per cent 
Accurate            of the people cannot easily afford to own a 
                    decent home; at least one-third of the 
                    population cannot afford to rent decent homes. 

Irrelevant      22. Comparable houses can frequently be rented 
Accurate            in the suburbs for somewhat lower rentals 
                    than in the city. 

II. Directions: Go back over the statements. In the columns 
labeled Part II place a check mark (√) opposite the number 
which corresponds to the number of the statement in: 

Column D if you believe that the statement can be proved to 
be true. 

Column E if you believe that the statement can be proved to 
be false. 

Column F if you believe the statement cannot be proved to be 
either true or false. 


Check each item in only one column. In case of doubt, give 
what seems to you to be the one best answer. 


[...] go on to Part III. 


A student may be able to make a logical analysis and to 
evaluate the argument very accurately and yet may not be 
able to judge whether or not a social policy is likely to 
achieve a given social objective. Therefore, in the third part 
of the test the student is given opportunity for [...] 
judgment. This part of the test consists of a particular social 
objective (improvement of the housing conditions of the third 
of the population with the lowest income), and several 
proposals, some appropriate, some inappropriate, for achieving 
this objective. The student is asked to indicate which 
proposals he thinks would be effective in achieving the 
objective. His reactions to this section of the test are 
summarized in terms of the number of times he chooses policies 
which are helpful in achieving the stated objective. 

An illustration of this part of the test is given below. 


III. Directions: In the column labeled Part III opposite the 
number which corresponds to the number of the statement, write: 

A plus sign (+) if it expresses a type of action which you 
think would improve the housing conditions of that third of 
the population with the lowest incomes. 


A zero sign (0) if it does not express a type of action which 
you think would improve the housing conditions of that third 
of the population with the lowest incomes. 
+  1. New buildings should be required to measure up to higher 
      minimum standards for construction. 
+  2. Credit for housing should be supplied in larger quantities 
      and at lower rates of interest. 
0  3. All city land should be reassessed. 
0  4. Laws should be passed requiring the destruction of all 
      slum areas. 
+  5. The government should subsidize housing for lower 
      income groups. 


Accurate response to each of the first three steps involves 
the use of certain general information. In case the student 
makes a large number of inaccurate responses, it is impor- 
tant to know whether it is because he does not have the in- 
formation or whether he knows the facts of the situation but 
cannot apply them. Therefore, in the last section of the test 
the student is asked to judge the truth or falsity of a series of 
statements which sample the information that is assumed in 
the preceding sections of the test. 

A sample of the factual statements in this section of the 
test which correspond to the arguments used in the illustra- 
tion of Part I is given below: 

Directions: Form II. The following items refer to the problem of
housing. In the columns labeled Form II place a check mark
(✓) opposite the number which corresponds to the number of
the statement, in:

Column X if you believe the statement to be true.

Column Y if you believe the statement to be false.

Column Z if you are uncertain whether the statement is true or
false.

True    1. At present various estimates agree that at least
           one-third of the population lives in unsanitary or
           unhealthy homes.

False   3. On the average, the cost of federal housing has been
           approximately $1,000 more per unit than the cost of
           comparable private construction.

False  11. To date the income from rent on housing projects
           has been large enough to pay for the original cost of
           the investment in a relatively short time.

False  14. Government competition in the construction of
           low-cost housing would probably not affect the value of
           slum property.

True   17. In the past, housing has been one of the largest
           private industries in the United States.

True   20. More than 50 per cent of the families in the United
           States have an annual income of $1,800 or less; while
           at the same time over three-fourths of the houses
           built in the last five years were built to be sold for
           over $4,000.

False   9. Statistical studies show that cost of living is as high
           in suburban areas as in the metropolitan districts.



Reactions to these statements are summarized in terms of 
the number of accurate, inaccurate, and uncertain responses. 
These scores are used primarily for aiding the interpretation 
of scores on the first two sections of the test. 


EVALUATION OF SOCIAL ATTITUDES 


ANALYSIS OF THE OBJECTIVE : 


The study of social attitudes has been of concern to
American psychologists and sociologists for a long time. The
literature on this subject, however, reveals a great diversity of
opinion regarding the proper delimitation of the behaviors
to be called “attitudes” and the terminology to be used in
denoting that behavior. Similar diversities also prevail in the
conceptions of the important characteristics of “attitudes”
and in the techniques employed in measuring these
characteristics.

The difficulties with the definition and classification of at- 
titudes soon became apparent as the schools began apprais- 
ing social attitudes. While the development of social attitudes 
was one of the most widely emphasized objectives among 
the schools in the Eight-Year Study, there seemed to be little 
clarity regarding the kind of behavior this objective involved 
and the significant areas in which it was important to develop 
and appraise social attitudes. 

Analysis of Behavior 

The initial statements from the schools revealed that many 
diverse types of behavior were considered to be social atti- 
tudes. Thus, some mathematics teachers submitted the ability 
to see quantitative relationships as an illustration of an atti- 
tude. Willingness to make an effort to express oneself clearly 
was one of the attitudes suggested by English teachers. Often 
objectives which seemed more closely related to interests 
and appreciations were included in this classification. Such 
personal qualities as resourcefulness, initiative in school work, 
and open-mindedness about the ideas of other people, along 
with beliefs about a wide range of social issues, were sug- 
gested in the statements of objectives submitted by the 
schools.

Recognizing the difficulties arising from the lack of clarity
as to what kinds of behavior could be classified as attitudes
and the diversity of objects toward which the suggested
attitudes were directed, the committee on social attitudes
proceeded along two major lines of analysis. It attempted (1)
to describe the nature of social attitudes sufficiently to
distinguish them from other school objectives, such as interests
and appreciations, and (2) to determine the major areas in
which the schools were usually concerned with social attitudes.
In doing this the committee
recognized that it could not solve the problem of defining
and classifying attitudes in a comprehensive fashion. Since
the committee was concerned with evaluation, it tried to
identify only those aspects of social attitudes which
constituted important objectives of the schools.

From this viewpoint the following distinguishing charac- 
teristics of attitudes were identified: 

1. An attitude may involve a feeling-tone of acceptance or
rejection. This feeling-tone may be evoked by an idea, a
person, a way of behaving, or a mode of doing things. Thus one
may like or dislike a person; reject or accept authoritarian
methods; be afraid of or feel at home with members of the
other sex, strange manners, or novel experiences. Attitudes
of this sort are rather directly expressed in immediate
behavior and the possession of “an attitude” may not
necessarily be consciously recognized by the person concerned.

2. To have a belief about, or an opinion about, or to take
a position toward an issue, value, or institution may be
considered another type of attitude. Thus one may approve of
equality for Negroes, be for or against religion, disapprove
of government control, believe in the efficacy of democratic
processes, or be opposed to war. Though beliefs of this sort
are not always arrived at by rational processes, they usually
involve a conscious intellectual recognition that a position is
being taken.

3. Often attitudes represent a latent tendency to act, such
as the disposition to be kindly and considerate toward aliens,
to defend the rights of minorities, or to proceed
democratically in managing student government. Presumably these
tendencies prevail as a result of conscious beliefs. However,
this does not mean that there is of necessity a consistent
relationship between what one believes and the character of
overt action. Overt behavior may often be inconsistent with
one’s conscious beliefs, or it may express or imply value
positions not consciously recognized as such by the individual.
Thus one may express prejudices toward certain ideas and
values in one’s daily behavior without reflecting upon the
implications of these actions or without recognizing the
beliefs which may have motivated them.

The problem of distinguishing the ways in which attitudes 
and social beliefs could be expressed was of major impor- 
tance for purposes of evaluation, since these distinctions 
would largely determine the techniques to be used in ap- 
praisal. For this reason the relationship between “beliefs 
about” or “feeling toward” and overt action was discussed at 
length by the committee. Considered from the standpoint of 
the techniques to be used in appraisal of attitudes, the lists 
of specific attitudes submitted by the teachers suggested 
three groupings. Some of these objectives referred to atti- 
tudes pertaining to immediate social relations, such as co- 
operation and respect for others. The schools were concerned 
with attitudes of this sort primarily as expressed in some form 
of overt action. This type of attitude could therefore be
appraised best by means of anecdotal recordings, behavior
records, and observational checklists to be devised by each
school for its own use.¹³

Another series of attitudes also permits expression in overt 
behavior, but social conventions and personal inhibitions 
tend to suppress that expression. Attitudes toward the other 
sex, toward family relations, toward certain aspects of one's 
own personality, and so on, are of this sort. Indirect methods 
of appraising these attitudes are necessary. A method of this 


type is described in the chapter on Personal and Social Ad- 
justment. 


13 Several such devices were developed. Behavior records developed
under the leadership of Eugene R. Smith will be discussed in Part II of this
book. The Francis Parker School developed a checklist, “Record for
Describing Attitudes and Behavior in High School,” covering: I, Cooperation;
II, Responsibility; and III, Attitude toward School Work. A somewhat
similar scheme for collecting anecdotal records was adopted in the Tower
Hill School.




A third group dealt with such social issues as international 
relations, unemployment, freedom of speech, and democracy 
in school. While measurable consequences in overt behavior 
attend some of these attitudes, their expression is largely 
confined to a theoretical or verbal level. Even adults as
individuals have only limited opportunities for expressing their
beliefs through overt action. Thus, for example, belief in the 
desirability of government aid to agriculture would in the 
case of most people be expressed in verbal arguments, in 
taking sides on ideas presented in print, or in writing about 
these issues. Only such “token overt action” as writing to 
one’s Senator or casting a vote on certain measures affecting 
the issue seemed to be open to the majority of people on a 
great many social issues. On the other hand, in a democracy 
the beliefs held by people influence social action by groups, 
and consequently a great deal of effort is directed toward 
clarifying beliefs and opinions on controversial issues. It was 
therefore thought important to appraise the development of 
these beliefs even though the appraisal would have to be 
confined to verbal expression of beliefs. Scales of beliefs
inviting reactions toward statements of opinion on significant
social issues seemed the most economical and appropriate
method for appraising attitudes of this sort.


Areas of Social Beliefs 
One of the first tasks in developing an instrument to
evaluate social beliefs was to secure suggestions regarding the
major areas of social beliefs to be covered in the appraisal.
Obviously it is possible to have a belief about almost
anything, and almost anything can be covered by the term
“social.” It was clear also that certain of the possible areas
of social beliefs were of more concern to schools than
others. The schools were, therefore, asked to suggest the
areas of social beliefs in which they were most interested. In
several cases both students and parents as well as teachers
participated in this exploration. The rating scales and attitude
tests already in use in schools were also examined. Samples
of student writing were analyzed, as were their choices of
“research” topics and free reading. In some classes daily logs
of topics of discussion were kept.

When compiled, these suggestions included the following 
areas of social issues: democracy—political and economic, 
the role of the machine and invention in contemporary civ- 
ilization, consumer problems, use of natural resources, labor, 
unemployment, housing, nationalism and internationalism, 
war and peace, school life, religion, and family. Some of 
these were mentioned by all schools and others by only a few. 

In order to provide means of appraisal of so varied a range 
of social beliefs, a series of instruments was developed. With 
the exception of one instrument devoted to appraisal of be- 
liefs on issues of school life, all of them deal with large social 
issues. The following list indicates the scope of this project. 


1. Beliefs on Social Issues (Form 4.21-4.31), an
instrument covering general social issues. Two forms were
developed, one for the senior high school level,
another for the junior high school.*

2. Beliefs on School Life (Form 4.6), an instrument 
covering issues in the area of school relationships. 


These two instruments included issues which were sug- 
gested by a large number of schools and were designed for 
general use. In addition, several instruments were developed 
for more specific purposes. These included: 


3. Beliefs on Economic Issues. This was made for a
school particularly interested in developing economic
attitudes through the study of selected short stories
and poems.

4. A series of instruments sampling in detail beliefs on
such issues as Men and Machines, Distribution of
Wealth, Consumer Problems, and Use of National
Resources, designed for a school emphasizing these
particular problems.

5. Beliefs on Housing in your Community, for two
schools conducting an intensive study of housing.


* Another form (4.9-4.10) included religion and family life in addition
to the areas covered in Form 4.21-4.31.


Of these, the development of the instrument Beliefs on
Social Issues is discussed in detail in this chapter. Brief
accounts are given of the Beliefs on School Life and Economic
Beliefs.


EVALUATION OF BELIEFS ON SOCIAL ISSUES


Before an instrument suitable for appraising beliefs on
social issues could be developed, it was necessary to (1)
select the areas of issues to include, (2) determine the types
of sub-issues to sample in each area, (3) decide on the level
of intensity at which each of the statements in the test should
be formulated, (4) designate the characteristics of beliefs
which were to be measured, and (5) choose a technique
appropriate for securing and summarizing the responses of
students. This section summarizes the preliminary
investigations which influenced the final decision on these problems.


Sampling of Issues and Formulation of Statements 

From the list submitted by the teachers, six areas of
interest to many schools were chosen by the committee. These
were: democracy, economic relations, labor and
unemployment, race, nationalism, and militarism. The problem of
determining the specific issues to be sampled in each area
and their direction was a more difficult one. To have
a discriminating instrument, it is not only necessary to
sample the significant aspects of an issue but also to sample the
major differences in beliefs about these aspects. Each one of
the major areas chosen was broad enough to involve a host
of more specific aspects. Thus the issue of equality of races 
involves such specifics as equality of work opportunity, of 
education, of political and civic rights, of social relations, 
and so on. A quite different set of sub-issues appears when 
the causes and consequences of racial equality or inequality 
are considered. The positions taken toward each of these 
aspects of racial equality may differ considerably in the case 
of the same individual, as well as from individual to indi- 
vidual. Thus those who believe that Negroes should have 
educational opportunities equal to those of whites may not 
believe that both groups should have equal opportunities for 
every kind of work. 

For an effective appraisal of beliefs it is also important 
to determine a reasonable threshold for each statement. A 
statement of a position toward any social issue can be 
phrased with any degree of intensity. It can be phrased so 
strongly that very few people can agree with it, or so mildly 
that most people responding to it can agree with it. Thus, a 
statement expressing opposition to equality for Negroes could 
be phrased to deny any form of equality or permit only
certain kinds of equality. A statement implying low standards
of morality or lack of intellectual ability could be applied 
to all Negroes, or only to Negroes of certain social status, 
and so on. Effective statements for the purpose of the meas- 
uring instrument are ones which elicit a reasonable amount 
of both agreement and disagreement from the students. 

Interwoven with this problem of threshold is the question
of the use of language in the statement of beliefs. Because
of the general nature of the issues, a certain degree of
abstractness in stating them seemed unavoidable. Abstract
terms, however, are often subject to different interpretations
by different people. Statements of opinion frequently
necessitate the use of emotionally colored words, the interpretation
of which varies from person to person. Care was therefore
necessary to avoid words likely to be ambiguous to the students
or likely to create emotional reactions causing an
interpretation irrelevant to the intended meaning of the statement.

To get suggestions on how to deal with these problems,
students in several schools were asked to submit statements
of opinion on issues in each of the six areas chosen. Several
hundred statements of opinion were collected in this way.
A selection of these chosen from each area was resubmitted
to the students. They were asked to indicate their agreement
or disagreement with each of the statements and then to
arrange all of the statements in ten groups, ranging from the
ones they thought stated strong opposition to ones stating
strong approval of the central issue in each area.

The results from these studies were used in several ways. 
By a priori analysis, lists of important issues to be sampled 
in each area had been drawn up. These lists were checked 
against the items suggested by the students to eliminate any 
issues of which students did not seem to be aware. The re- 
duced lists of issues then served as a basis for formulating 
statements for the test. In the area of democracy, for
example, the statements sampled the following issues:


1. Civil liberties, such as freedom of speech, the right to
trial by a jury, and the right to vote.

2. Equality of opportunity and responsibility in a
democracy such as equality in economic and educational
opportunities, and equality of responsibility in carrying the
financial burden of the government.

3. Manner of appointing and electing government officials
and representatives.

4. Functions and responsibilities of democratic government
in promoting general welfare, such as providing medical
care and social security for all.

5. Freedoms and responsibilities of citizens in a democracy.

6. Influences of social and economic classes on democracy.


From the students' responses it was also possible to
determine the kinds of statements which were so extreme as to
elicit either a unanimous agreement or a unanimous dis- 
agreement. Usually only the items on which there was a 
reasonable division of opinion were chosen. In a few in- 
stances, however, items were retained because they were 
considered important and because there was reason to be- 
lieve that unanimity of opinion was caused by some special 
factor in the background of these students rather than by 
the fact that the issue was not in general a debatable or a 
significant one. Whenever possible, the terms used by stu- 
dents were employed. All statements were scrutinized by a 
jury of 12 persons for possible ambiguity, or other verbal 
difficulties, and for their relevance to the major issue. 
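
The selection rule described above can be illustrated with a minimal sketch. The text speaks only of "a reasonable division of opinion"; the 10 per cent threshold, the function names, and the trial data below are assumptions made for illustration, not figures from the Study.

```python
from collections import Counter


def division_of_opinion(responses):
    """Return (share agreeing, share disagreeing) among definite responses.

    Responses are "A" (agree), "D" (disagree) or "U" (uncertain);
    uncertain responses are left out of the shares.
    """
    counts = Counter(r for r in responses if r in ("A", "D"))
    total = counts["A"] + counts["D"]
    if total == 0:
        return 0.0, 0.0
    return counts["A"] / total, counts["D"] / total


def select_items(trial_responses, min_share=0.10):
    """Keep statements on which both sides reach min_share (an assumed cutoff)."""
    kept = []
    for item, responses in trial_responses.items():
        agree, disagree = division_of_opinion(responses)
        if min(agree, disagree) >= min_share:
            kept.append(item)
    return kept


# Hypothetical try-out data: three statements, six students each.
trial = {
    "s1": ["A", "A", "D", "A", "D", "U"],  # divided opinion
    "s2": ["A", "A", "A", "A", "A", "A"],  # unanimous agreement
    "s3": ["D", "D", "D", "D", "U", "D"],  # unanimous disagreement
}
print(select_items(trial))
```

A purely mechanical rule of this kind would also drop the few unanimous items that the committee chose to retain on grounds of importance, so in practice its output would be a starting list for the jury rather than a final selection.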


Characteristics to Be Diagnosed 

In considering the characteristics of beliefs, three were 
found to be of importance to schools. In the first place, the 
teachers wanted to see whether increased understanding of 
social problems brought about an ability and willingness to 
take personal positions upon an increasing range of social 
issues. One of the main criticisms of social education in 
schools had been centered on the failure to develop in stu- 
dents personal viewpoints toward important social issues. 
It was therefore decided that the prospective instrument 
should be so set up as to diagnose the extent to which 
students are able and willing to take a definite stand on
social issues. 

Teachers were also interested in learning about the direc- 
tion of positions taken by the students. Thus they wanted 
to know whether on the whole students accepted or rejected 
the principle of universal freedom of speech, whether
students were for or against certain measures to alleviate
poverty and unemployment, and so on. This interest in the type
of positions taken did not imply a decision regarding the
desirability of any one specific position, however. While
there was a fairly close agreement among the teachers on
the desirability of developing acceptance of democratic
processes and of racial tolerance, it seemed both impossible and
undesirable to classify the positions on many other issues as
desirable or undesirable. At the same time, it seemed
necessary to adopt some scheme of distinguishing and classifying
the positions taken toward the statements of opinion
included in the test. Unfortunately, most of the terms used to
refer to the direction of attitudes suggest an idea of
rightness or wrongness, an approval or condemnation of a given
position. The members of the committee wished to avoid
such terms for summarizing the test results, but found it
impossible to locate any terms which did not have such
connotations. The terms liberal and conservative were finally
adopted as a convenient way of describing two opposite 
directions on issues selected for the test. The meanings 
adopted for these terms will be discussed later in connection 
with the description of the scoring and summarizing of the 
responses. 

The consistency of students’ beliefs was a third
characteristic teachers wished to diagnose. Teachers regarded
consistency as a desirable characteristic of social beliefs, no
matter which position was taken. The committee recognized
at least two levels of consistency. Generalizing a multitude
of specific beliefs in different areas into a coherent and
consistent viewpoint represented one level. Inconsistency in this
case would reveal itself by a shift of viewpoint from area
to area. The other and more specific level involves the
consistency of beliefs toward the same issue. Inconsistency in
this case means agreement with expressions of opposite
viewpoints on the same issue. It seemed possible to diagnose
consistency of the first type by examining the direction of
beliefs in each of the areas. To get evidence on consistency of
the second type, two statements expressing opposite view- 
points on each issue were included in the instrument. 
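
The second kind of consistency check can be sketched as follows. This is only an illustration of the pairing idea: the function name and the response data are hypothetical, and the sketch keeps "agrees with both" and "disagrees with both" as separate counts instead of assuming how the committee treated the latter. The item numbers are borrowed from the sample pairs printed later in this chapter.

```python
def consistency_report(pairs, responses):
    """Classify each pair of opposed statements by the student's two responses.

    pairs     -- list of (item in first section, opposed item in second section)
    responses -- dict mapping item number to "A", "D" or "U"
    """
    report = {"consistent": 0, "agree_both": 0, "disagree_both": 0, "undecided": 0}
    for first, second in pairs:
        a, b = responses.get(first), responses.get(second)
        if a not in ("A", "D") or b not in ("A", "D"):
            # An uncertain or missing response on either statement.
            report["undecided"] += 1
        elif a != b:
            # Agreeing with one statement and rejecting its opposite.
            report["consistent"] += 1
        elif a == "A":
            report["agree_both"] += 1
        else:
            report["disagree_both"] += 1
    return report


# Hypothetical responses of one student to two of the sample pairs.
pairs = [(79, 189), (35, 182)]
responses = {79: "D", 189: "A", 35: "A", 182: "A"}
print(consistency_report(pairs, responses))
```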


Techniques of Constructing the Scale 


There are several possible techniques of securing and 
summarizing the responses of students to statements of is- 
sues. Thurstone regards the intensity of a feeling or position 
as the most significant characteristic of attitudes, and has 
developed a series of scales measuring the intensity of the 
favorable and unfavorable positions toward a single issue, 
such as war and peace.¹⁵ Each statement in a scale
containing 20 or more represents a position toward a given issue,
these positions ranging from intense opposition to intense
approval, with a neutral zone in the middle. A quantitative
“scale value” is assigned to each statement and the student's
score is expressed as the median of the scale values of the
statements he endorses. Low scores indicate opposition and
high scores indicate approval. Another approach is used by
Neumann.¹⁶ He attempts to combine a survey of various
international issues with a measure of the intensity of
reaction toward each one. He accomplishes this by including
tion toward each one. He accomplishes this by including 
statements on a series of issues and by directing students to 
mark each statement by indicating five degrees of reaction 
ranging from strong approval to strong disapproval. 
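
The median scoring rule attributed to Thurstone above can be made concrete with a short sketch. The statement labels and scale values are invented for illustration and are not taken from any published scale; only the rule itself (the median of the scale values of the endorsed statements) comes from the text.

```python
from statistics import median


def thurstone_score(scale_values, endorsed):
    """Median scale value of the statements a student endorses.

    scale_values -- dict mapping statement label to its scale value
    endorsed     -- labels of the statements the student endorsed
    Returns None when no statement is endorsed.
    """
    values = [scale_values[label] for label in endorsed]
    if not values:
        return None
    return median(values)


# Hypothetical scale: low values express opposition, high values approval.
scale_values = {"a": 1.5, "b": 4.0, "c": 6.5, "d": 9.0}

print(thurstone_score(scale_values, ["b", "c", "d"]))  # middle of 4.0, 6.5, 9.0
print(thurstone_score(scale_values, ["a", "b"]))       # midpoint of 1.5 and 4.0
```

With an even number of endorsed statements `statistics.median` averages the two middle values, which is one common convention; the source does not say how such ties were handled.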

Although several schools in the Study used Thurstone's 
scale for measuring Attitude Toward War, and tried out 
experimentally a modified form of Neumann's Attitude
Indicator, the committee decided that a still different
technique would be more useful in serving the purposes of these
particular schools. It was believed that separate scales, each
of which focuses on a single major issue (e.g., war or
religion), make it relatively easy for a student to decide what
is likely to be the “acceptable” position and to respond
accordingly, thus raising questions as to the validity of the
instrument as an indicator of the student's “real” attitude. This
aspect of validity might be at least partially protected by
mixing statements on a variety of issues in the same
instrument and avoiding the use of titles which would reveal the
major issues included. Moreover, it seemed more important
to the schools to appraise the positions on a range of
sub-issues under each major area of issues than to scale in detail
the intensity of each position. To attempt to do both would
probably result in an instrument too long for practical use.
All of these considerations influenced the technique which
was eventually chosen and which will be described in the
next section.


15 L. L. Thurstone and E. J. Chave, The Measurement of Attitude
(Chicago, University of Chicago Press, 1929), pp. 10-12.

16 G. B. Neumann, A Study of International Attitudes of High School
Students (New York, Teachers College, Bureau of Publications, Columbia
University, 1926).


DESCRIPTION OF THE TEST ON BELIEFS ON SOCIAL ISSUES
(Form 4.21-4.31)


After the above-mentioned problems had been considered, 
a plan emerged for a new instrument to measure Beliefs on 
Social Issues. In the present form it consists of 200 statements, 
classified under the following areas of issues: democracy, 
economic relations, labor and unemployment, race, nation- 
alism, and militarism. Students respond to each statement 
by indicating agreement, disagreement, or uncertainty. The 
statements are arranged in random order and are presented 
to the students in two sections given at different times. For 
each statement in the first section there is a statement in the
second section representing the opposite viewpoint.

A sample of the statements is given below. Statements
from the two sections of the test are shown together. The
key is inserted after each statement.


Democracy

4.21        Complete freedom of speech should be given to all
            groups and all individuals regardless of how radical
            their political views are.
            (A, Liberal; D, Conservative; U, Uncertain.)

4.31        Freedom of speech should be denied all those groups
            and individuals that are working against democratic
            forms of government.
            (D, Liberal; A, Conservative; U, Uncertain.)


Economic Relations

4.21        Since the welfare of a whole nation depends on its
            natural resources, their use should be subject to
            public control.
            (A, Liberal; D, Conservative; U, Uncertain.)

4.31        Those who own oil wells, coal mines, and other
            natural resources should be allowed to operate them as
            they think best.
            (D, Liberal; A, Conservative; U, Uncertain.)


Labor and Unemployment

4.21   14.  Most workers who are unable to provide for
            themselves during a period of unemployment have been
            too shiftless to save.
            (D, Liberal; A, Conservative; U, Uncertain.)

4.31  104.  The wages of most workers are so low that it is
            impossible for them to save enough money to support
            themselves during periods of unemployment.
            (A, Liberal; D, Conservative; U, Uncertain.)


Race

4.21        It is all right for Negroes to be paid lower wages
            than whites for similar kinds of work.
            (D, Liberal; A, Conservative; U, Uncertain.)

4.31        The same wages should be paid to Negroes as to
            whites for work which requires the same ability and
            training.
            (A, Liberal; D, Conservative; U, Uncertain.)


Nationalism

4.21   79.  Our government ought to protect American business
            interests in foreign countries even if it involves using
            our army and navy.
            (D, Liberal; A, Conservative; U, Uncertain.)

4.31  189.  Our government should not risk a war to protect
            American business interests in foreign countries.
            (A, Liberal; D, Conservative; U, Uncertain.)


Militarism

4.21   35.  The amount of profit made from the sale of war
            materials should be strictly limited.
            (A, Liberal; D, Conservative; U, Uncertain.)

4.31  182.  Men should be allowed to make profits out of
            munition making just as they are allowed to make profits
            from other business enterprises.
            (D, Liberal; A, Conservative; U, Uncertain.)


Scoring and Summarizing the Results

Scoring and Summarizing the Results 
The responses to the whole test as well as to each of the
areas are summarized under four main headings: liberalism,
conservatism, uncertainty, and consistency. No attempt was
made to arrive at a categorical definition of the terms liberal
and conservative. These terms were adopted for convenience
only and carry a somewhat different connotation with
reference to each area. The liberal point of view in the area of
democracy, for instance, tends to endorse freedom of speech;
democratic processes in government; responsibility of the
government for promoting the welfare of all groups in
society with respect to health, security for old age, and the
protection of consumers; and reinterpretation of the
Constitution and other basic laws in keeping with present-day social
and economic demands. The conservative position tends to
approve restrictions on freedom of speech, to limit the
responsibility of the government for social welfare, and to
favor a strict interpretation of the Constitution.

In the area of economic relations, the liberal position tends
to endorse government regulation of public utilities, natural
resources, wage levels, insurance, and to approve of moving 
in the direction of production for use rather than for profit. 
The conservative position represents the policy of economic 
individualism, the policy of laissez faire, and the preserva- 
tion of the profit system in unrestricted form. 

With respect to labor and unemployment, the liberal
position tends to favor collective bargaining; to approve of social
legislation providing for minimum wage levels, health
insurance, and unemployment relief; and to maintain that
unemployment is caused by social conditions beyond the control of
individuals, and hence that its consequences should be borne
by society rather than by the individuals who happen to be
affected by it. The conservative position tends to oppose the
organization of labor for collective bargaining; to oppose
labor legislation or expenditure of government funds for
relief of unemployment; and to maintain that unemployment
is caused by some deficiency of the individuals, and hence
that the consequences should be borne by those who happen
to be unemployed.

In the area of race, the liberal position tends to endorse 
the equality of all races as far as social, economic, and educa- 
tional opportunities are concerned, and to deny that racial 
inequality is inherent or inborn. The social, economic, and 
educational status of Negroes as a group is attributed to en- 
vironmental conditions rather than to hereditary causes. The 
conservative position accepts the inherent supremacy of the 
white race and indorses racial discrimination of all sorts. 

A pacifistic viewpoint represents liberalism in the area 
of militarism: that is, the tendency to favor arms limitation,
arbitration, and condemnation of war as a way of settling international
troubles. Belief in the inevitability of war, in
armed preparedness, in the use of armed force, and in the
benefits of military training for character development illustrates
the conservative position.

In the area of nationalism, a liberal viewpoint is ascribed
to those who are internationally-minded, who recognize the
worth and the contributions of other nations, and who deny
that there is need for protecting a nation's imperialistic economic
enterprises abroad with armed forces. A conservative
viewpoint is associated with emphasis on national glory and 
honor, and the belief that American ways would be best for 
other peoples; it tends to defend the notion of the supremacy 
of America and of things American in all respects and to 
insist on the use of American standards in judging the contributions
of other nations.

In all areas the uncertain response is taken to mean either
that the student does not understand the statement or that
he is unable to take a position regarding the issue because
of conflicting ideas about it. It was also anticipated that a
relatively high degree of uncertainty might characterize the
position of the more thoughtful students. Consistency indicates
the extent to which students take a similar position
twice on the same issue; i.e., do not agree with both of two
contradictory statements. The tendency to take a similar position
on a range of issues in one area or in different areas is
indicated by the percentage of liberal and conservative responses
in each area.


As can be seen from the data sheet, these four headings
(liberalism, conservatism, uncertainty, and consistency) are
used to summarize both the total scores and the scores
for each of the six areas. No such headings are printed on the
instrument itself, and the student is not aware that his
responses are to be classified in this way. Moreover, it cannot
be emphasized too strongly that, as far as the test itself is
concerned, there is no implication that either the liberal or
the conservative position is to be preferred. The test
is designed to measure the status or change of beliefs; the
problem of determining the desirability of the direction that
the beliefs of students should take is a responsibility of the
schools. 
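
The four summary scores described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scoring key of the instrument: the item numbers, the keying of each item ("L" when agreement is the liberal response, "C" when agreement is the conservative response), and the rule that a pair of statements counts as consistent only when both responses point in the same direction are hypothetical. Each score is expressed as a per cent, as on the data sheet.

```python
def direction(response, key):
    """Classify one response as 'liberal', 'conservative', or 'uncertain'.

    response: "A" (agree), "D" (disagree), or "U" (uncertain).
    key: "L" if agreeing is the liberal response, "C" if agreeing is conservative.
    (This keying scheme is an assumption made for the sketch.)
    """
    if response == "U":
        return "uncertain"
    points_liberal = (response == "A") == (key == "L")
    return "liberal" if points_liberal else "conservative"


def summarize(responses, keys, pairs):
    """Return the four summary scores as per cents.

    responses: dict of item number -> "A", "D", or "U".
    keys: dict of item number -> "L" or "C".
    pairs: list of (item, item) tuples holding contradictory statements.
    """
    directions = {item: direction(responses[item], keys[item]) for item in keys}
    n_items = len(directions)

    def per_cent(label):
        count = sum(1 for d in directions.values() if d == label)
        return 100.0 * count / n_items

    # Assumed rule: a pair is consistent when the student takes the same
    # definite position on both statements.
    consistent = sum(
        1
        for a, b in pairs
        if directions[a] == directions[b] and directions[a] != "uncertain"
    )
    return {
        "liberalism": per_cent("liberal"),
        "conservatism": per_cent("conservative"),
        "uncertainty": per_cent("uncertain"),
        "consistency": 100.0 * consistent / len(pairs),
    }


# Invented example: four items forming two contradictory pairs.
example_keys = {1: "L", 2: "C", 3: "L", 4: "C"}
example_pairs = [(1, 2), (3, 4)]
example_responses = {1: "A", 2: "D", 3: "U", 4: "A"}
example_scores = summarize(example_responses, example_keys, example_pairs)
```

Because every response falls under exactly one of the first three headings, the scores on liberalism, conservatism, and uncertainty in this sketch always total 100 per cent, which is why they must be read in relation to each other.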


Explanation of the Data Sheet 


three questions centering about the direction, uncertainty,

is it distributed?

The scores on liberalism (columns 25 and 1-6) indicate
the per cent of the statements to which the student re-

A, for example, has
per cent of all items
furthermore, are distrib-

even distribution of
(columns 7-12) is Stu-
as far as his total score

pacifism (M, 78), but is at the same time inclined to reject
collective bargaining
measures to combat unemployment
(LU: liberalism, 44, conse-

type of fluctuation can be observed i

The terms "liberal" and "conservative" are used throughout this sec-
ing the more lengthy references to their meaning in
and labor and unemployment (LU, 14) he makes few liberal
responses, while his score on toleration of racial equality 
(R, 88) is very high. In his case, however, the absence of 
liberal responses cannot be interpreted as a rejection of this 
position. His scores on uncertainty in these two areas (un- 
certainty: ER, 60, LU, 58) indicate that in these areas he 
has difficulty in taking a position. In the few instances he 
does so, the responses in the conservative direction prevail
(ER: liberalism, 7, conservatism, 33; LU: liberalism, 14, conservatism,
30).

The second question is: To what extent are the students 
willing (or able) to take definite positions on these social
issues? 

The uncertainty (columns 27 and 13-18) scores give the 
per cent of responses in which a student neither agrees nor 
disagrees with the statements. High uncertainty might mean 
desirable caution, inability to understand the statements, 
lack of information, or lack of conviction. In most cases this 
response seems to mean “I don’t know or I can’t decide,” 
for socially conscious and active students usually have low 
“uncertain” scores. Thus, Student C is very uncertain of his 
position on all of the issues with the exception of those per- 
taining to race. He scores far above the median for the class 
on total uncertainty (column 27), and in five of the areas. 
Students A and D indicate little uncertainty as to their positions.
Neither extreme certainty nor extreme uncertainty is
in itself desirable. Whether or not either can be considered
desirable depends on the total pattern of scores.
Thus, certainty combined with high consistency is more acceptable
than high certainty combined with low consistency
because flexibility is important as long as there is confusion.
Experience with test data has shown also that certainty combined
with high conservatism is not as desirable from the
standpoint of growth as is high certainty combined with
high liberalism. This conclusion was drawn because it was
found that conservative beliefs were more frequently borrowed
beliefs, while liberal beliefs were more often arrived
at through personal thought and consideration. In interpret- 
ing the meaning of high or low uncertainty, however, the 
developmental trend of the student needs to be considered. 
Thus one would expect an increase in uncertainty whenever 
an individual is in a state of transition from one type of 
social viewpoint to another.

The third question is: To what extent are the students
consistent in the positions they take?

The consistency (columns 28 and 19-24) scores give the 
per cent of consistent responses on the total test and in the 
areas listed above. High scores in these columns indicate 
clarity of outlook, whether it is liberal or conservative in its 
direction. Low consistency may occur for at least two reasons.
Students may be inconsistent because of inability to
think through their beliefs or because they are actually
embracing conflicting positions. In the first case, there is
likely to be an even distribution of inconsistency scores in
all areas. In the other case there is more likely to be high
consistency in some areas and low consistency in other areas.
While high consistency can be generally regarded as a desirable
characteristic, one must be aware that often inconsistency
is a by-product of transition from one pattern of
beliefs to another. In the latter case, low consistency may
be an index of change and may be temporary. Whether this
is true or not can be determined if the test is readministered
after an appropriate interval of time and a description of the
kinds of changes taking place in students is secured.

Student A is the most consistent of the four students whose
scores are given on the data sheet. Student B shows a variable
pattern of consistency: On racial issues he is rather consistent
(consistency: R, 80), but on issues of labor and unemployment
he is the least consistent student in his entire group.
Similar fluctuation in consistency from
area to area is shown by Students C and D. Student D is
rather consistent on issues in economic relations (consist- 
ency: ER, 80) and relatively inconsistent on racial issues 
(consistency: R, 40). 

The scores on liberalism, conservatism, and uncertainty 
are interdependent and must be viewed in relation to each 
other. This can be illustrated by comparing the scores on 
economic relations for Students C and D. Both of these stu- 
dents have low scores on liberalism in this area, but while 
Student C is rather highly uncertain, Student D is highly 
conservative. Thus scores on liberalism alone tell only part 
of the story. One can infer that the low score on liberalism 
in the case of Student C results from the fact that he has not 
made up his mind on many of the issues. Student D, how- 
ever, seems to have definite convictions about economic 
relationships. For this reason the interpreter must, in addition 
to studying each score independently, consider the whole 
pattern of scores before arriving at a final judgment about 
a student or groups of individuals. 

Several other general considerations apply in interpreting 
different combinations of score patterns. Thus, when the 
score on uncertainty is unusually high, the scores on both
liberalism and conservatism are of necessity low. In such 
cases one can interpret these scores better by comparing 
them with each other than by comparing each with the 
median. Thus, in the case of Student C one might say that 
whenever he makes up his mind on economic relations his 
position will be predominantly in the conservative direction, 
because 33 per cent of the items are marked in the conserva- 
tive direction while only 7 per cent of the items are marked 
in the liberal direction. High scores on uncertainty, coupled 
with high scores on consistency, are more likely to be an 
indication of intelligent doubt than of mere confusion and 
inability to see the issues clearly. Conversely, lack of uncertainty
where inconsistency is high would indicate a premature
feeling of security about beliefs which in reality are
confused. Decisions such as these concerning the relative desirability
of high or low scores on liberalism are left for the
teacher to make. 
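
The two rules of thumb in this passage can be written out as a small Python sketch. The first function restates the comparison made for Student C: when uncertainty is high, the liberal and conservative scores are compared with each other, here as shares of the responses on which a position was taken. The second function labels the two patterns named in the text. Treating "high" and "low" as above or below the class median is an assumption made for the sketch, as are the function names; the text gives no cut-off.

```python
def direction_among_decided(liberalism, conservatism):
    """Share of liberal and conservative responses among decided responses.

    Both arguments are per cent scores for one area. Returns None when the
    student took no definite position in the area.
    """
    decided = liberalism + conservatism
    if decided == 0:
        return None
    return {
        "liberal": 100.0 * liberalism / decided,
        "conservative": 100.0 * conservatism / decided,
    }


def doubt_pattern(uncertainty, consistency, median_uncertainty, median_consistency):
    """Label the two patterns described in the text.

    The median cut-offs are a hypothetical way of deciding what counts as
    "high"; the source leaves that judgment to the interpreter.
    """
    high_uncertainty = uncertainty > median_uncertainty
    high_consistency = consistency > median_consistency
    if high_uncertainty and high_consistency:
        return "intelligent doubt"
    if not high_uncertainty and not high_consistency:
        return "premature security"
    return "no special pattern"


# Student C on economic relations: liberalism 7, conservatism 33.
student_c_er = direction_among_decided(7, 33)
```

For Student C, 33 of the 40 per cent of items on which he takes a position are marked in the conservative direction, so about four fifths of his decided responses are conservative. The labels returned by the second function are only a starting point for the whole-pattern judgment the text asks of the interpreter.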

Although in the course of the above discussion comments 
have been made concerning the scores of four students, no 
attempt has here been made to present a complete and co- 
herent account of the beliefs of these students. The data on 
each student and each group of students made available by 
this instrument are too extensive to permit the presentation 
within the limits of this chapter of a complete treatment of
the possibilities of interpretation.


Validity and Reliability 
Several factors influence the validity of this instrument. In
the first place, there is the problem of the role of language
in expressing feelings and viewpoints. In statements of issues
terms which have different meanings for different individuals
are apt to be used. The expressions of attitudinal
positions also require the use of some words or ideas to
which strong emotional reactions are attached and these reactions
usually are not the same from individual to individual.
Certain words may evoke responses somewhat independent
of, or irrelevant to, the meaning and intent of the
whole statement. Also involved is the fact that many individuals
are not clear about their own beliefs. Those who tend
to be confused or uncertain about their own positions are
apt to respond more or less automatically to familiar terminology
in place of attempting to determine what their
beliefs are. Moreover, it is likely that people with no
definite beliefs on a given issue may be induced to give
definite responses partly because familiar verbal stereotypes
are presented to them.

Secondly, there is the problem of securing a valid response.
Social beliefs are somewhat in the realm of the private
life of an individual and he is not always willing to
reveal them. There are either general social pressures or pres- 
sures in a given group toward the "right way of believing," 
and individuals whose personal beliefs differ from the pre- 
dominant ones may feel threatened in disclosing them. Thus, 
often in a school where the majority of students are liberal 
in a certain respect, those who do not share the liberal view- 
point are put on the defensive. This applies also to teacher- 
pupil relations. Even in responding to an instrument of this 
sort which is not, properly speaking, a “test,” students are 
apt to try to live up to the expectations of teachers who are 
known to favor certain viewpoints rather than to express 
their own viewpoints. It is for reasons like these that the 
question of validity is peculiarly complex in the measure- 
ment of social beliefs. 

An additional difficulty lies in the fact that the social be- 
liefs of individuals are rarely so generalized that the subjects 
mentioned in the statements do not affect the response. 
Thus, in securing opinions upon the issue of government 
control vs. economic individualism, it may make a consider- 
able difference whether the issue is stated with reference to 
public utilities or to railroads, whether the object of con- 
trol is profits or wages, and so on. Ideally, the specific issues 
used in the test should include all of these variations. Since 
this ideal cannot be achieved in a test of this sort, one is 
faced with the problem of sampling and of the reliability of 
the sample.

The efforts made in the process of construction to assure
high validity for the test were described above. Summarized
briefly, these consisted of securing a clear delimitation by
the committee of the behavior to be measured, and of utilizing
statements from students in deciding which specific issues
to include, in determining the level of intensity at
which statements should be formulated, and in phrasing the
statements. Finally, the instrument was revised several times
on the basis of analyses of student responses to tentative
forms. 

In addition to the above precautions, several studies of 
the validity were conducted.19 In the first study the instrument
was given to 65 junior and senior classes studying
American history and sociology in a large public high school. 
Verbal descriptions of the beliefs of these students based on 
their numerical scores were made and these were discussed 
with the cooperating teacher. The validity of the scores in 
each area in the scale was considered separately. The teacher's
judgments of the social attitudes of the students as revealed
by his observations in the classroom coincided with
the interpretations of the scores from the test in 90 per cent 
of the cases. 

Thirty of these students were interviewed. They were 
chosen on the basis of the test scores so that they represented
the ten most conservative, the ten most liberal, and
the ten most inconsistent and uncertain students in the entire 
group. The questions asked in these interviews paralleled 
the statements of the test. Some of the students were ques- 
tioned regarding their points of view within a single area; 
others were interrogated with respect to two, three, or even 
all six areas. When the information obtained in this way was 
compared with the test results, the two sets of data were 
found to be fairly consistent; that is, the direction of points 
of view, the certainty, and the consistency of the students 
as revealed by the test were very closely related to those 
indicated by their verbally expressed opinions. 

A second study of the validity of the instrument was carried
out in a ninth grade social science class composed of 18
students. Written descriptions of their social beliefs as revealed
by test scores were made. Apprentice teachers collected
hundreds of anecdotes pertaining to expressions of,
and behavior relative to, the social viewpoints of the students,
and also examined these students’ written work. They
then summarized their findings by rating these students on
a five-point scale for liberalism and for consistency in each
of the six areas. It was found that over 90 per cent of the
judgments of the teachers coincided with the test ratings.
The students in this group were also interviewed. In 17 out
of the 18 cases, the opinions expressed in the conferences
conformed closely with the responses to the test.

19 These validity studies were conducted by Paul R. Grim, and the
discussion here summarizes his findings described in "A Technique
for the Evaluation of Attitudes in the Social Studies,"
submitted to the Ohio State University in 1939. Dr. Grim's study was made
in connection with the Form 4.2-4.3. Only slight revisions were made in
the Form 4.21-4.31.

In one study of reliability, coefficients20 for this test based
on a total population of 600 students selected from 14 schools
and representing grades nine through twelve were computed.
The results were as follows: On liberalism they
ranged from .79 to .86 for the different areas; for the total
score on liberalism the coefficient was .95. On conservatism
they ranged from .72 to .81 in different areas; the reliability
coefficient for the total score on conservatism was .93. On
uncertainty the range of reliability coefficients was from .79
to .85, and a coefficient of .96 was obtained for the total
score. On consistency the reliability coefficients ranged from
.45 to .61, with a coefficient of .85 for the total score.21 These
data check rather closely with those obtained in other studies 
from other populations and by other methods. The scores in 
the test are stable enough so that, within appropriate statis- 
tical limits, they may be used for diagnosis of individual as 
well as group differences. 

As can be seen from these data, the stability of the scores
by areas is a good deal lower than the stability of the total
scores. The scores on consistency by areas have particularly
low stability and can be used only to designate the extremes.
All other scores, used within the context of the whole pattern
of scores and within appropriate statistical limits, can
be used for helpful diagnostic judgments regarding the
nature of social beliefs.

20 Estimated by the Kuder-Richardson formula. More complete data on
reliability and other statistics are given in the Appendix.

21 Since pairs of items are scored to determine consistency, the test is
in effect only half as long for this purpose.
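
The footnotes state only that the coefficients were estimated by the Kuder-Richardson formula and that, for consistency, the test is in effect half as long. As a rough sketch of what such a computation involves, the Python below implements Kuder-Richardson formula 20 for items scored 0 or 1, together with the Spearman-Brown relation between test length and reliability. Which Kuder-Richardson formula was used, and how the three-way responses were reduced to 0 or 1 item scores, are not stated in the text; both are assumptions here. The Spearman-Brown step is an added illustration, not the authors' procedure, and the sample data are invented.

```python
def kr20(item_scores):
    """Kuder-Richardson formula 20 for dichotomous (0/1) item scores.

    item_scores: one list per student, each holding that student's 0/1
    score on every item. Uses the variance of total scores computed with
    the number of students as divisor.
    """
    n_students = len(item_scores)
    n_items = len(item_scores[0])

    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    # Sum over items of p * q, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_students
        sum_pq += p * (1.0 - p)

    return (n_items / (n_items - 1)) * (1.0 - sum_pq / var_total)


def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Invented data: four students, three items.
sample_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
sample_kr20 = kr20(sample_scores)

# A test of reliability .95 cut to half its length.
half_length = spearman_brown(0.95, 0.5)
```

Under the Spearman-Brown relation, halving a test with a reliability of .95 would be expected to give a coefficient of about .90. The figure is offered only to show the direction of the effect that the footnote on consistency describes; the consistency coefficients reported above were obtained from the data, not from this relation.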


BELIEFS ABOUT SCHOOL LIFE 


Another scale of social beliefs (Beliefs about School Life,
Form 4.6) was devoted to the area of school life.
Appraisal of the beliefs regarding various aspects of school
life was considered important for several reasons. In the
first place, students’ points of view on such matters as grades
and awards, methods of teaching, and ways of conducting
the school government determine to a considerable extent
the type and the effectiveness of their adjustment to school.
The beliefs prevailing among students on these matters also
influence the organization and functioning of the school since
students’ beliefs play an important part in motivating their
behavior in specific situations. Finally, certain of these beliefs
represent aspects of “democracy in school” and as such
are considered in many schools as desirable ends in themselves.
Awareness of the nature of these beliefs on the part
of both students and teachers is helpful in accomplishing
desirable changes in the school environment or in an individual
student's reactions to that environment. For these
reasons a means of obtaining systematic evidence on beliefs
toward a range of issues about school life was thought to be
a desirable addition to observations of overt behavior.


Analysis of the Objective

In order to be sure that the test sampled opinions on issues
of concern relative to school life, two investigations were
conducted. First, some students were asked to write brief
essays on “Democracy in My School.” Their essays discussed
many kinds of problems, from rules regarding the use of lipstick
to criticism of the course of study which they were
following. Secondly, a list of the major areas of school life 
and illustrative statements of issues in each area was sent to 
teachers in several schools. They were asked to criticize the 
choice of issues and the tentative list of specific statements, 
and to make additions to either if they thought there were 
important omissions. In analyzing the material obtained 
from teachers and students, it was found that the most fre- 
quently mentioned issues could be classified in six major 
areas: school government, curriculum, grades and awards, 
school spirit, pupil-teacher relations, and group life. These 
became categories of summary for the instrument which was 
developed. This instrument is similar in form to the one de- 
scribed in the preceding section except for the difference in 
content and the fact that no attempts were made to meas- 
ure consistency. It consists of a series of 118 statements of 
opinion, and students respond by indicating either agree- 
ment or disagreement with them, or uncertainty about them. 
In the following paragraphs a brief description of the cate- 
gories and some illustrative statements from the instrument 
are given. 


Description of the Test 


The area of school government samples such issues as 
appropriate bases for electing students to school offices, treatment
of minority groups, and appropriate degree of student
responsibility for the conduct of school affairs. Student responses
to these items are classified as democratic and undemocratic.
For example, agreement with each of the following
statements is scored as a “democratic” response, and
disagreement with these statements is scored as an “undemocratic”
response:


19. Criticisms of the school government made by first year 
pupils should be considered just as carefully as criticisms 
which juniors and seniors make.

20. The teachers and principal should have pupils help in
deciding what books to buy for the school library. 


The area of group life involves issues of the status of various
school groups and their relations to each other and to
school. The following problems are included: the extension
of special privileges of various sorts only to members of certain
groups, the maintenance of class distinctions in terms of
these groups, and the desirability of characterizing students
as members of certain groups or cliques rather than as individuals.
Responses to these items are summarized in terms
of the number of responses indicating a “social” attitude,
meaning approval of equal treatment of all groups, and a
“class” attitude, indicating a disposition to approve all kinds
of distinctions and cliques. For example, agreement with the
following statements indicates a “class” attitude, whereas
disagreement indicates a “social” attitude:


6. Pupils from the wealthier families in a community and
pupils from the poorer families should not be put in the
same homeroom together.

99. In most cases, it is undesirable to have slow and bright
pupils working together in the same class.


The area of pupil-teacher relations involves problems of
sharing responsibility between teachers and pupils, and of
the methods by which the allocation of responsibility should
be made. The following issues are sampled: the
degree of pupil-planning of various school activities, ways
of making decisions, types of problems which teachers alone
should solve. Reactions to this group of items are summarized
in terms of the number of responses indicating approval of
cooperative relations, and the number indicating approval
of authoritarian relations. Following are two illustrations of
items in this area in which disagreement with the item indicates
approval of cooperative methods and agreement indicates
approval of authoritarian methods:


2. It is better for a teacher to decide what the pupils are to
study in a class than to let the pupils plan their work by
themselves.

17. Too much time is wasted when pupils take part in the
discussion of plans for a unit of study.


The area of curriculum involves issues of educational phi- 
losophy and practice. Responses to these issues are summa- 
rized in terms of liberal and conventional attitudes. A “lib- 
eral” attitude is indicated by an experimental point of view: 
that is, a belief in the integration of school subjects, pupil- 
teacher planning, flexibility in planning units of study, and 
in utilizing community resources. A “conventional” attitude 
is indicated by a disposition to maintain rigid subject mat- 
ter divisions, to prefer teacher-planned courses of study, and 
to emphasize the acquisition of facts and information. The 
following statements are taken from this area: 


11. It would be a good idea for several teachers of different 
school subjects to take part in a class discussion with a 
group of pupils. 

56. Trips outside of the school building should not be taken 
at a time when they interfere with the regular class 
schedule. 


In the above illustration, agreement with the first statement 
indicates a “liberal” attitude, whereas agreement with the 
second indicates a “conventional” attitude toward school 
problems. 

The area of grades and awards samples issues concerning 
the appropriate use of grades and awards, and the types of 
grades and awards which are desirable. For example, such 
statements as the following are made: 


18. If a pupil receives failing grades most of the time, it shows
that he is not learning anything in school.

50. If grades were done away with, pupils would have no
way of knowing whether they were making progress in
their studies.


Responses to such issues are summarized as non-traditional 
or traditional. “Non-traditional” attitudes are indicated by 
questioning the desirability of using grades and awards as 
incentives, as means of determining participation in school 
activities, and as providing the exclusive measure of the 
value derived from school life. The “traditional” point of 
view is indicated by an acceptance of grades and awards for 
such purposes.

The area of school spirit is sampled by issues concerning 
the extent of school loyalty which is desirable, and the types 
of expressions of school loyalty which are appropriate. For 
example, the following statements are offered for considera- 
tion: 


40. We would get some helpful ideas for improving our school
by visiting other schools to see how they do things.

102. One of the best ways for a pupil to show that he is a good
school citizen is always to defend his school when others
criticize it.

Agreement with the first statement is classified as a “cosmopolitan”
point of view, agreement with the second as
a “provincial” attitude. A “cosmopolitan” viewpoint is indicated
by a disposition to recognize certain weaknesses in
one’s own school, a disposition to view the school as a chang- 
ing rather than as an inflexible institution, and a tendency 
toward "worldliness" in one's relations with students from 
other schools. A “provincial” viewpoint is indicated by ex- 
pressing intense loyalty to one’s immediate group to the ex- 
tent of excluding cooperative relations with other groups. 
In addition to the descriptive categories, the number of uncertain
responses in each area is given.
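
The method of summarizing described above can be sketched in Python using only the items quoted in this section. For items 19, 20, 6, 99, 2, and 17 the text states what both agreement and disagreement indicate. For items 11, 56, 40, and 102 the text states only what agreement indicates; scoring disagreement as the opposite category is an assumption made for the sketch. Items 18 and 50 are left out because their keying is not stated. The names `ITEM_KEYS` and `summarize_school_life` are hypothetical.

```python
from collections import Counter, defaultdict

# item number -> (area, category if the student agrees, category if he disagrees)
ITEM_KEYS = {
    19: ("school government", "democratic", "undemocratic"),
    20: ("school government", "democratic", "undemocratic"),
    6: ("group life", "class", "social"),
    99: ("group life", "class", "social"),
    2: ("pupil-teacher relations", "authoritarian", "cooperative"),
    17: ("pupil-teacher relations", "authoritarian", "cooperative"),
    # Disagreement categories below are assumed, not stated in the text.
    11: ("curriculum", "liberal", "conventional"),
    56: ("curriculum", "conventional", "liberal"),
    40: ("school spirit", "cosmopolitan", "provincial"),
    102: ("school spirit", "provincial", "cosmopolitan"),
}


def summarize_school_life(responses):
    """Count responses by area and category.

    responses: dict of item number -> "A" (agree), "D" (disagree),
    or "U" (uncertain). Returns a dict of area -> {category: count},
    with uncertain responses counted separately in each area.
    """
    summary = defaultdict(Counter)
    for item, response in responses.items():
        area, if_agree, if_disagree = ITEM_KEYS[item]
        if response == "A":
            summary[area][if_agree] += 1
        elif response == "D":
            summary[area][if_disagree] += 1
        else:
            summary[area]["uncertain"] += 1
    return {area: dict(counts) for area, counts in summary.items()}


# Invented responses from one student to five of the quoted items.
example = summarize_school_life({19: "A", 20: "D", 6: "U", 99: "D", 2: "A"})
```

Counts are used here, not per cents, because the text says that responses are summarized in terms of the number of responses in each category.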

As is indicated by the method of summarizing student responses,
the test may be useful in identifying points of view
on the part of an individual student which are likely to be
hampering his adjustment to, and active participation in, 
school life. It must be noted, however, that the test has not 
been studied sufficiently to warrant a recommendation that 
it be used for precise individual diagnosis. Its primary use- 
fulness is for studying groups. Only students who deviate 
markedly from the group pattern can be identified with as- 
surance as being significantly different from others in the 
group. 

A teacher who wishes to use the test should examine it 
with respect to her own school situation in terms of the fol- 
lowing criteria: (1) Does it sample problems and conflicts 
which pupils in this school must deal with in order to make 
a better adjustment to school life? (2) Are the beliefs to- 
ward school life which are sampled likely to affect participa- 
tion in social movements and processes outside school? (3) 
Does it involve issues regarding educational philosophy 
which are really controversial issues within this school? 
(4) Does it sample beliefs which may provide clues con- 
cerning the behavior of individual pupils in a variety of 
situations in this school? 


BELIEFS ON ECONOMIC ISSUES


Frequently the Evaluation Staff received requests for spe- 
cialized instruments to evaluate certain unique features of a 
particular school program. One such request was for the development
of means of appraising the effects on social awareness
of the reading of fiction dealing with social problems.
The literature used in this program described social and eco- 
nomic problems, offered explanations of the causes and ef- 
fects of these conditions, and suggested (in certain cases) 
types of solutions for the problems. 


Analysis of the Objective 


In analyzing the effects of such a program, it was appar- 
ent that they might be classified as follows: (1) increasing 
student awareness of existing social and economic conditions;
(2) stimulating the development of a consistent social
philosophy; and (3) aiding students to see the implications
of their personal social philosophy for concrete action in
specific problem situations.

Two characteristics were thought important in describing 
awareness or recognition of social and economic conditions. 
First, there is the extent of the awareness or lack of it. The 
extent of awareness may be characterized either by the range 
of problems of which an individual is aware or by the depth 
of understanding about any particular problem. It was de- 
cided that in this instance the range of problems to which 
an individual responds was more significant than the depth 
of his understanding of any one problem. The lack of aware- 
ness may be expressed in several ways. Students may believe 
that conditions are worse than facts indicate, that they are 
better than the facts indicate, or they may feel uncertain 
about either the existence or non-existence of these condi- 
tions. The second characteristic of awareness is consistency. 
An individual who has a clear impression of actual social and 
economic conditions will not agree with both of two plausible
statements describing exactly opposite conditions. An
instrument designed to measure awareness of social and economic
conditions should yield evidence on each of these characteristics
of awareness.

An individual's social philosophy may also be described in
terms of several characteristics. First, there is the question
of its general direction: Is it highly individualistic or based
on humanitarian values and concern for the general welfare?
Is it dominated by the acceptance of the status quo?
Does it indicate a willingness to change contemporary social
and economic conditions? Second, the degree of certainty
with which an individual holds a particular point of view is of
interest in appraising his social philosophy. Certainty may
be described either in terms of the degree of conviction
or in terms of the range of issues
toward which one indicates a positive point of view. For the
purposes of this particular appraisal, certainty in the latter 
sense was considered more significant. The third important 
characteristic of a social philosophy is the degree of its in- 
ternal consistency. | 

An individual's ability to see the implications of his social 
philosophy for concrete social action may be described first 
with respect to the predominant type of social action he gen- 
erally approves or disapproves in specific problem situations, 
and in terms of the variety or comprehensiveness of things 
which he agrees should be done. Second, the type of social 
action about which he is frequently uncertain can be de- 
scribed. Third, the types of problem situations in which he 
approves an extensive and far-reaching social action, those 
in which he approves little or no social action, and those in 
which he is primarily uncertain, may be indicated. 


Description of the Test


On the basis of the analysis of (a) the types of issues
sampled in the literature and of (b) the nature and characteristics
of the behavior to be measured, a test called Scale
of Beliefs on Economic Issues was constructed. This test is
made up of three parts.

The first part of the test consists of statements that certain
conditions do or do not exist in the United States. The
statements are made in pairs so that while one statement indicates
the existence of a given condition, the other statement
in the pair indicates the existence of exactly the opposite
condition. The student reacts by indicating that he
agrees, disagrees, or is uncertain about each statement purporting
to describe existing conditions. In order to get an
index of the consistency of his responses the two scales containing
opposite statements are given on different days. Responses
to this part of the test are summarized in terms of
the number of answers which indicate awareness of social
and economic conditions, lack of awareness of these conditions,
uncertainty about them, and consistency of belief about
them.
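
The summary described for the first part of the test can be made concrete with a short sketch in Python. It is an illustration only and not the scoring procedure of the Evaluation Staff. The response codes A, D, and U, the function name score_awareness, and the rule that a pair counts as consistent when the student agrees with one statement and disagrees with its opposite are all assumptions introduced here.

```python
AGREE, DISAGREE, UNCERTAIN = "A", "D", "U"  # hypothetical response codes


def score_awareness(pairs):
    """Summarize responses to paired statements about existing conditions.

    Each pair is (response_to_factual, response_to_opposite), where the
    first statement is assumed to agree with the facts and the second
    asserts exactly the opposite condition.
    """
    counts = {"aware": 0, "unaware": 0, "uncertain": 0, "consistent_pairs": 0}
    for factual, opposite in pairs:
        # Tally the two answers of the pair one at a time. The "aware"
        # answer is agreement with the factual statement and
        # disagreement with its opposite.
        for response, aware_answer in ((factual, AGREE), (opposite, DISAGREE)):
            if response == UNCERTAIN:
                counts["uncertain"] += 1
            elif response == aware_answer:
                counts["aware"] += 1
            else:
                counts["unaware"] += 1
        # Assumed rule: a pair is consistent when the student takes
        # opposite positions on the two opposite statements.
        if {factual, opposite} == {AGREE, DISAGREE}:
            counts["consistent_pairs"] += 1
    return counts


# Four hypothetical pairs of responses from one student.
example = [("A", "D"), ("A", "A"), ("U", "D"), ("D", "A")]
print(score_awareness(example))
# {'aware': 4, 'unaware': 3, 'uncertain': 1, 'consistent_pairs': 2}
```

Note that under this assumed rule the last pair is consistent although both answers show lack of awareness; consistency and awareness are tallied separately, as the description of the test suggests.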

The second part of the test consists of statements sam- 
pling various points of view regarding the types of condi- 
tions which are desirable. These statements are also made 
in pairs in order to obtain evidence on the consistency of 
the student’s social philosophy. One set of conditions, if considered
desirable, implies approval of the status quo, whereas
the other, if followed to its logical implications, would involve
changes in the present scheme of things. The issues
sampled in this section of the test parallel those sampled
previously. That is, in the first section there is a statement
as to the extent to which people achieve economic security
today; in the second section, a statement concerning the
degree to which people ought to have economic security.
The student reacts by indicating agreement, disagreement,
or uncertainty about each statement. A student's responses to
this section of the test are summarized in terms of the degree
to which he accepts and approves the status quo, the degree
to which he accepts a social philosophy which implies change
in the present order, the degree to which he is uncertain
about his social philosophy, and the degree to which his
social philosophy is internally consistent.

The third part of the test is made up of a number of prob- 
lem situations describing some specific instances of the con- 
ditions described in the first section of the test. The descrip- 
tion of the problem is followed by five courses of action that 
represent different points of view about what should be done 
about such specific problems. The types of points of view 
sampled in the courses of action have been labeled futile,
conservative, compromise, liberal, and radical. These terms
are not to be understood as meaning anything other than convenient
summaries of various points of a scale ranging from
the attitude of “do nothing” to the attitude of “change the
whole system.” The student is asked to indicate whether he
agrees, disagrees, or is uncertain about each course of action.
His responses are summarized in such a way as to indicate 
the extent to which he agrees, disagrees, or is uncertain about 
each type of social action. 
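
The summary for the third part of the test amounts to a tally of responses by type of social action, which may be sketched as follows. The sketch is illustrative and rests on assumptions: the response codes A, D, and U and the function name summarize_actions are introduced here, and only the five labels (futile, conservative, compromise, liberal, radical) come from the description of the test.

```python
ACTION_TYPES = ("futile", "conservative", "compromise", "liberal", "radical")


def summarize_actions(responses):
    """Count agreement (A), disagreement (D), and uncertainty (U)
    for each type of social action.

    responses is an iterable of (action_type, response) tuples, one
    for every course of action the student marked.
    """
    summary = {kind: {"A": 0, "D": 0, "U": 0} for kind in ACTION_TYPES}
    for action_type, response in responses:
        summary[action_type][response] += 1
    return summary


# Hypothetical responses of one student to two problem situations,
# each followed by five courses of action.
marked = [
    ("futile", "D"), ("conservative", "A"), ("compromise", "A"),
    ("liberal", "U"), ("radical", "D"),
    ("futile", "D"), ("conservative", "U"), ("compromise", "A"),
    ("liberal", "A"), ("radical", "D"),
]
for kind, tally in summarize_actions(marked).items():
    print(kind, tally)
```

A tally of this kind shows at a glance which types of action a student predominantly approves, which he rejects, and about which he is most often uncertain.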


USES OF THESE TESTS


The fact that a test is valid “in general” does not assure
that valid results are necessarily obtained in a given school 
or with a given group of students. There are many condi- 
tions which must be fulfilled if these tests are to be useful. 
The most obvious one is that the teacher should be interested 
in developing the kinds of behavior diagnosed in the test. 
Thus the tests dealing with social values and beliefs should 
be considered only if the development of social beliefs and 
the ability to analyze social problems in terms of a personal 
pattern of social values is of concern to the school. 

A certain minimum background on the part of the students
is also assumed in several of these tests. For instance, to obtain
valid results from the test on Social Problems (Form
1.42), it is necessary for students to have had some opportunity
to discuss controversial [...]. [...] exercise from this test as a
pre-test, before undertaking a specific unit of study.
This is appropriate when the students have had some general
experience with the problem and the teacher is anxious to find
out at which level to attack the problem with them.

It is also important for the teacher to decide whether the 
content and vocabulary of these tests are appropriate for his 
group.²² Too often in selecting a test, consideration is given
only to its appropriateness for a given grade level. Pupils who 
do not respond sensitively to the connotations of the words 
used in these tests will not give an accurate picture of their 
social beliefs and values. The absence of a time limit helps, 
but not sufficiently for many groups. 

The attitudes and expectations of students at the time of 
taking the test regarding the purpose of the test and the use 
of the results are extremely important in all tests in which 
students are expected to express their own viewpoints. If 
the students expect to be graded on such tests, or if for some 
reason they think that they should please the teacher, they 
are likely to mark the test according to their best guess of 
what is expected of them. Certain precautions have been
taken in the tests themselves to prevent dishonest marking. 
Thus in the Scales of Beliefs the items pertaining to a range 
of issues are in random order to make it more difficult for 
the students to see what the “acceptable” responses might 
be. In the Social Problems test the directions for marking 
the test do not reveal the kind of analysis to be made of the 
responses. No such precautions, however, can take the place 
of a classroom in which the pupils and the teacher trust one 
another.

Provided, then, that the qualities diagnosed in the test are 
of concern to the teacher, that the content and vocabulary 
of the tests are appropriate to the level of student develop- 
ment, and that students feel free to express their own views, 
several fruitful uses of the results are possible. In the first 


22 With the exception of the test, Beliefs on School Life, which can be
used in grades seven to twelve, none of these tests is appropriate for [...]
verbal students, nor should they be given below the tenth grade except in
unusual cases.

place, the teacher may want to diagnose the strengths and
weaknesses of the individuals in his class, in order that he 
may give each one the kind of help he needs. In the case of 
the application of social values, the difficulty of some stu- 
dents may be in their lack of social awareness, while others 
are blocked by their inability to see the implications of social 
values in concrete social problems. Conflicting or confused 
values prevent clear thinking for some students, while gul- 
libility to slogans may be the main difficulty with others. 
Each needs a different kind of help. Experiences necessary 
for broadening awareness do not necessarily contribute to 
greater consistency. The methods employed to clarify values
and beliefs and to eliminate prejudices differ from the
methods of building up a more realistic understanding of
social phenomena. Students whose difficulty is the absence
of any personal viewpoint are not helped by the kinds of
experiences needed by those handicapped with entrenched
biases and prejudices. The results of the test on Social Problems
(Form 1.41 or 1.42) throw some light on the needs of
individuals in these respects.


If the teacher is [...] beliefs, he may want [...]
to be confused, to embrace conflicting viewpoints, or have
[...] this type may also [...] understanding difficulties in thinking
logically. For example, students who reveal strong prejudices
[...] economic relations in the Scale of Beliefs
test often make mistakes in reasoning in this area in the
Social Problems test. The barrier is emotional, not necessarily
intellectual.

[...] with social phenomena. First-hand exploration of the community
and use of literary materials to illustrate social problems
became a part of most programs. Democratic processes
in administering school affairs were introduced in the hope 
that personal democratic attitudes might be developed. 
These hypotheses need to be checked by evidence of changes 
taking place in students. Furthermore, curriculum experi- 
ences effective in one respect sometimes produce unexpected 
and undesirable results in some other respect. Thus, courses 
dealing with modern problems, introduced to enlarge social 
awareness, sometimes increase inconsistency and enhance 
ambivalence and confusion of social values. An emphasis on 
democratic processes in school may develop loyalty to certain 
values in this situation, but without proper reference to 
larger social problems, a double standard of democratic 
values may result. 

There are many points at which an objective check is particularly
needed. One of the most common difficulties in
social education is that students tend to master generalized
concepts without seeing concretely enough how these concepts
apply in a variety of life problems. Thus, students tend
to remember and accept such democratic tenets as equality 
of opportunity or freedom of speech, without recognizing in 
life the problems in which these values are involved and 
the ways in which they are violated. The use of the Scale of 
Beliefs in conjunction with the test on Social Problems shows 
in what degree these difficulties are present among students. 

A teacher may also want to see whether his students are 
achieving an increasingly consistent social viewpoint. Most 
individuals tend to accept values which are in conflict with
others which they hold at the same time. While one would
not expect anyone to be wholly free of these conflicts, one
would hope that with increasing maturity and with increasing
understanding these conflicts would tend to be eliminated.
Often, however, school programs tend to increase
these conflicts rather than to eliminate them. This is particularly
the case when the community or the family has a different
philosophy from the one emphasized in the school.

A similar effect is produced when students are exposed to 
many new experiences creating new beliefs and values with- 
out sufficient time to reconsider the values they have already 
developed in their previous experiences. Conflicts are particu- 
larly apt to appear between general beliefs and their specific 
implications. Thus, it is not uncommon to see students ap- 
prove of a more equitable distribution of wealth in general 
and at the same time be violently opposed to such practical 
measures to achieve it as the graduated income tax or minimum
wage law. As long as the school programs tend to emphasize
generalities, while experiences at home and in the
community contribute to the development of specific values 
and loyalties, such conflicting viewpoints are unavoidable. 
An increasing ambivalence and conflict rather than increas- 
ing clarification and integration of social outlook result unless 
teachers are continually aware of points at which individuals 
need help in integrating or clarifying their value concepts 
and beliefs. The examination of the distribution of the scores
on values in the Social Problems test and of the scores on
the Scales of Beliefs would reveal to what degree and at which
points individuals and groups are embracing contradictory
values and beliefs.

In addition to diagnosing the strengths and weaknesses of
individuals at a given time, teachers may also be interested
in changes occurring over a period of time. The diagnosis of
growth is particularly important in connection with the aspects
of social sensitivity dealt with in this chapter [...]
finally established during the high school years. At best, one
can hope to establish certain tendencies and predispositions 
and to initiate certain techniques of analysis and inquiry. 
This means that it is important to get evidence of the direc- 
tion of changes taking place in students. Administering tests 
of this sort over a period of time would help determine such 
long-term changes.²³

Generally it is not advisable to use any of these tests less 
than a year apart. They are too general in content, in the first 
place, to reveal minor changes. Secondly, the scores are not 
reliable enough to detect small amounts of change. However, 
the exercises in the test Application of Social Facts and Gen- 
eralizations (Form 1.5) can be used as a pre-test and as an 
end-test in evaluating the effectiveness of a given unit of 


study, within an interval of a few weeks. The use of these
tests as a pre-test would serve two ends: (1) to [...] the
background of the students in order to attack the problem
at an appropriate level, and (2) to give direction and
impetus to the study. The end-test would show how well
students had mastered the ideas and techniques for understanding
a given problem.

It must be pointed out here that while each of these tests
was designed as an independent unit, better information
about the students and the effectiveness of the curriculum is
secured when several of them are given and interpreted together.
This is particularly true of the Scale of Beliefs
(Form 4.21-4.31) and of the Social Problems test (Forms
1.41 and 1.42). These two tests were planned as companion
instruments, one to give an overview of general beliefs, and
the other to diagnose their application in concrete situations.
In most cases the data from a single instrument must

23 The tests of beliefs, such as the Scale of Beliefs on Social Issues (Form
4.21-4.31), can be administered several times. Two forms of the test on
Social Problems (Forms 1.41 and 1.42) have been made available. These
forms are sufficiently similar to enable teachers to compare scores on one
form with those on the other form.

be supplemented with other evidence before safe inferences
can be drawn. This is particularly the case when it is neces- 
sary to carry the diagnosis to the point of locating the causes 
of difficulty. Thus, ambivalence of value pattern may be the 
result of lack of acquaintance with the issues involved, lack 
of ability to see logical relations, sheer inability to read and 
to understand this test, or a genuine division of viewpoint. 
These possibilities have to be checked against other evi- 
dence, such as reading scores, scores on psychological tests, 
tests on logical thinking, or daily observations of students’ 
behavior in the classroom. Only after such checking can the 
teacher be safe in planning the experiences necessary to 
eliminate the difficulties. 

In still other cases, the interpreter needs to resort to a more
detailed analysis of student responses than is possible by
examining the score sheet. In the case of the Social Problems
test, some students may have difficulties in connection with
certain problems and issues and not with others. Whenever
there is reason to believe that the scores on the data sheet
have covered up important information, it is profitable to
examine the answer sheets themselves.


Chapter IV 


ASPECTS OF APPRECIATION 
INTRODUCTION 


All of the lists of objectives submitted by schools in the 
Eight-Year Study mentioned the development of a wide 
range, an increasing depth, and a personal selection of inter- 
ests and appreciations. Accordingly, an interschool Commit- 
tee on the Evaluation of Interests and Appreciations was 
formed early in the Study and met frequently to analyze 
this area of objectives. One of its first conclusions was that, 
although interests and appreciations are so closely related 
that it is often impossible to distinguish them in specific instances,
techniques for evaluating them would be sufficiently
different to justify a division of labor. The committee was 
therefore divided into sub-groups after arriving at a common 
understanding of the objectives to be considered. Many 
subtle distinctions were drawn between interests and appreciations,
but their common purport seemed to be that interests
emphasize “liking” an activity, while appreciations include
“liking” but emphasize “insight” into the activity:
understanding it, realizing its true values, distinguishing the
better from the worse, and the like. The sub-committees on
appreciations developed instruments chiefly in the fields of
literature and the arts, which are reported in this chapter.
The work of the Committee on Interests is reported in Chapter V.


APPRECIATION OF LITERATURE


Since there are somewhat different points of view as to 
what is meant by the objective "Appreciation of Literature," 
it is important to recognize at the outset that the analysis
which will be described here is restricted to an analysis of
certain types of students' reactions to reading. This restriction
should not be taken to imply that other behaviors might
not be included under the heading "Appreciation of Literature";
a number of articles and studies might be cited to
illustrate the range of behaviors which have, at various
times, been identified with appreciation. Carroll,² for example,
mentions information, sensitivity to style, understanding
of “deeper meanings,” and emotional response as included
in appreciation. In developing his tests of prose
appreciation Carroll chose to measure students' ability to
differentiate the good from the less good and the less good
from the very bad.³ This ability has been regarded by many
as an important element in, or index of, appreciation. Logasa
and Wright, to cite a second example, have made a rather
extensive analysis of appreciation⁴ and have published tests
of the following behaviors: discovery of theme, reader par- 
ticipation, reaction to sensory images, discrimination be- 
tween good and poor comparisons, recognition of rhythm, 
and appreciation of fresh expressions as opposed to triteness. 
Instead, the restriction mentioned above merely implies a 
selection, on the part of the committee, of behaviors which 
(1) were regarded by them as important aspects of appre- 
ciation, and (2) were not being adequately appraised by the 
available instruments. A major question which the committee 


1 Cf. Broom, M. E., “Literature and Aesthetics,” The High School Teacher,
VIII (October, 1932), pp. 293-294.

2 Carroll, Herbert, “A Method of Measuring Prose Appreciation,” English
Journal, XXII (March, 1933), p. 184.

3 Op. cit., p. 185.

4 See “Tests for Measuring Appreciation,” School Review, XXXIII (September,
1925), pp. 491-492.

wished to be able to answer is: “How do students react to
their reading?” For convenience, certain of these reactions 
to reading have been designated as “Aspects of Apprecia- 
tion.” 


The Committee’s Analysis of Students’ 
Reactions to Reading 

The Committee on the Evaluation of Reading was organ- 
ized in the fall of 1935. In selecting members for this 
committee the schools recognized that teachers other than 
teachers of literature are often responsible for guiding the 
reading of students and hence should participate in the eval- 
uation of reading outcomes. For this reason, in addition to 
the field of English, other areas, such as social studies, the 
core program, the school library, and school administration,
were represented by various members of the committee. Because
of the wide geographical distribution of the schools in
the Eight-Year Study, this committee was divided into two
sub-committees, one of which met in New York City and the
other in Chicago. During the school years 1935-36, 1936-37,
and 1937-38 a number of committee meetings were held in 
these two cities. The meetings held in New York City were 
attended by representatives of 16 eastern schools; meetings 
in Chicago were attended by representatives of eight schools 
in the Middle West. Members of the Evaluation Staff also 
attended these meetings and coordinated the work of the 
two sub-committees. 

The Committee on the Evaluation of Reading undertook, 
as its first task in developing instruments for appraising students’
reactions to their reading, to clarify what was meant
by “reactions to reading.” A preliminary analysis of students’
reactions to reading was made, at the request of the committee,
by Carleton Jones of the Evaluation Staff and was submitted
to them for revision. After some discussion, the committee
selected from the preliminary analysis seven behaviors
or reactions to reading which seemed to them to be of considerable
importance. These are:


1. Satisfaction in the thing appreciated
Appreciation manifests itself in a feeling, on the part of
the individual, of keen satisfaction in and enthusiasm for
the thing appreciated. The person who appreciates a given
piece of literature finds in it an immediate, persistent, and
easily-renewable enjoyment of extraordinary intensity.

2. Desire for more of the thing appreciated
Appreciation manifests itself in an active desire on the
part of the individual for more of the thing appreciated.
The person who appreciates a given piece of literature is
desirous of prolonging, extending, supplementing, renewing
his first favorable response toward it.

3. Desire to know more about the thing appreciated
Appreciation manifests itself in an active desire on the
part of the individual to know more about the thing appreciated.
The person who appreciates a given piece of
literature is desirous of understanding as fully as possible
the significant meanings which it aims to express and of
knowing something about its genesis, its history, its locale,
its sociological background, its author, etc.

4. Desire to express one's self creatively
Appreciation manifests itself in an active desire on the
part of an individual to go beyond the thing appreciated:
to give creative expression to ideas and feelings of his
own which the thing appreciated has chiefly engendered.
The person who appreciates a given piece of literature is
desirous of doing for himself, either in the same or in a
different medium, something of what the author has done
in the medium of literature.

5. Identification of one's self with the thing appreciated
Appreciation manifests itself in the individual's active
identification of himself with the thing appreciated. The
person who appreciates a given piece of literature responds
to it very much as if he were actually participating
in the life situations which it represents.

6. Desire to clarify one’s own thinking with regard to the
life problems raised by the thing appreciated 
Appreciation manifests itself in an active desire on the 
part of the individual to clarify his own thinking with re- 
gard to specific life problems raised by the thing appre- 
ciated. The person who appreciates a given piece of litera- 
ture is stimulated by it to re-think his own point of view 
toward certain of the life problems with which it deals 
and perhaps subsequently to modify his own practical 
behavior in meeting those problems. 

7. Desire to evaluate the thing appreciated 
Appreciation manifests itself in a conscious effort on the 
part of the individual to evaluate the thing appreciated in 
terms of such standards of merit as he himself, at the 
moment, tends to subscribe to. The person who appreci- 
ates a given piece of literature is desirous of discovering 
and describing for himself the particular values which it 
seems to hold for him. 


An example may aid in clarifying each of these seven 
behaviors. Let us suppose that a student has read a particular 
novel, such as Dickens’ Tale of Two Cities, and that during
the reading of this book he has read attentively and with
absorption (1). Let us also suppose that he has derived such
satisfaction from the book that he plans to read it again and
to read other novels by Dickens (2). Perhaps his curiosity
about Dickens as an author, about the literary currents of
the middle nineteenth century, about the historical novel as
a type, or about the French Revolution has been aroused by
his reading (3). He might want to sketch Carton riding to
the guillotine or try to conceive in words some scene or
character which grows out of his reading (4). While reading
he might “lose himself” in the events of the book; he might,
like Booth Tarkington's Willie Baxter, become one with Carton
and feel that “It is a far, far better thing that I do . . .” (5).
Many problems might be suggested or raised again
for him by his reading; he might want to think through what
friendship or love implies, what the proper ends of life are,
what terror and force effect in the world (6). Finally, he 
might want to compare this novel with others by Dickens 
and others of its type, compare his judgments of it with 
those of other persons, seek out its values and its limita- 
tions (7). 

This statement of important reactions to reading is a selec- 
tive one and should be regarded as such. A number of other 
reactions or responses to reading might be identified and 
judged to be of importance by other teachers or test makers. 
Pooley,⁵ for example, has made a rather detailed analysis of
“fundamental” and “secondary” responses to prose and
poetry which differs somewhat from the analysis accepted 
by the committee. Since our purpose is to report what was 
done by these committees and the Evaluation Staff during 
the period of the Eight-Year Study, a comprehensive discus- 
sion of the many definitions of appreciation or of the many 
possible analyses of responses to reading cannot be given. 
Consequently, the omission of a careful consideration of the 
many studies and tests of literary appreciation which have 
been made by others should not be regarded either as an 
oversight or as evidence of a belief that the work reported 
here exhausts the topic "The Evaluation of Appreciation of 
Literature." 


Instruments Which Were Developed to Appraise 
Students’ Reactions to Their Reading 

A number of instruments were developed for the evalua- 
tion of students’ reactions to their reading. Three of these 
instruments make use of a questionnaire technique which 
consists essentially of asking students to observe themselves, 
in retrospect, and to record these observations. This tech- 
nique was arrived at in the following manner. The commit- 
tee first discussed ways in which the seven types of reaction 


5 Pooley, Robert, “Measuring the Appreciation of Literature,” English 
Journal (High School Edition), XXIV (October, 1935), pp. 621-638.

to reading might be manifested in readily observable student
behavior and prepared a list of overt acts and verbal re- 
sponses which, they judged, would in certain situations re- 
veal the presence or absence of each of these seven types of 
behavior. A few of the overt acts and verbal responses which 
were included in this list are: 


1. Satisfaction in the thing appreciated
1.1 He reads aloud to others, or simply to himself,
passages which he finds unusually interesting.
1.2 He reads straight through without stopping, or with
a minimum of interruption.
1.3 He reads for considerable periods of time.
2. Desire for more of the thing appreciated
2.1 He asks other people to recommend reading which
is more or less similar to the thing appreciated.
2.2 He commences this reading of similar things as soon
after reading the first as possible.
2.3 He reads subsequently several books, plays, or poems
by the same author.
3. Desire to know more about the thing appreciated
3.1 He asks other people for information or sources of
information about what he has read.
3.2 He reads supplementary materials, such as biography,
history, criticism, etc.
3.3 He attends literary meetings devoted to reviews,
criticisms, discussions, etc.
4. Desire to express one’s self creatively
4.1 He produces, or at least undertakes to produce, a
creative product more or less after the manner of
the thing appreciated.
4.2 He writes critical appreciations.
4.3 He illustrates what he has read in some one of the
graphic, spatial, musical, or dramatic arts.
5. Identification of one’s self with the thing appreciated
5.1 He accepts, at least while he is reading, the persons,
places, situations, events, etc., as real.
5.2 He dramatizes, formally or informally, various passages.
5.3 He imitates, consciously and unconsciously, the
speech and actions of various characters in the story.
6. Desire to clarify one’s own thinking with regard to the life
problems raised by the thing appreciated
6.1 He attempts to state, either orally or in writing, his
own ideas, feelings, or information concerning the
life problems with which his reading deals.
6.2 He examines other sources for more information
about these problems.
6.3 He reads other works dealing with similar problems.
7. Desire to evaluate the thing appreciated
7.1 He points out, both orally and in writing, the elements
which in his opinion make it good literature.
7.2 He explains how certain unacceptable elements (if
any) could be improved.
7.3 He consults published criticisms.


The committee next suggested that one method of securing 
evidence of these seven types of response in secondary 
schools would be to ask students to report on these be- 
haviors themselves. The advantage of asking students to 
observe themselves and to record these observations, as com- 
pared with the collection of anecdotal records or the use of 
interviews, is primarily one of practicability. The committee 
also recognized that the use of a questionnaire technique 
demands that certain assumptions be fulfilled if the method 
is to give valid evidence. Most important among these
assumptions are: (1) that the overt behaviors and their
accompanying situations specified in the items are significant
evidence of the seven types of behavior; (2) that the students
are capable of observing these overt behaviors, of remember- 
ing them, and of recording them; (3) that the students are 
honest in their responses to each item. The extent to which 
these assumptions actually are fulfilled will depend upon 
both the characteristics of the questionnaire itself and the
situation in which the student is asked to respond to the
questionnaire. First, let us review the construction of one of 
these three questionnaires, pointing out the criteria in its 
construction which were made necessary by these assump- 
tions; later we shall consider the administration of such an 
instrument and the conditions under which its use is most apt 
to give valid evidence. 


Questionnaire on Voluntary Reading 

Of the three appreciation questionnaires which were developed
during the period of the Eight-Year Study (The Novel
Questionnaire, The Drama Questionnaire, and The Questionnaire
on Voluntary Reading), The Questionnaire on Voluntary Reading
was used and studied most extensively; for this reason it will
be chosen to illustrate the construction of an instrument to
measure students’ responses to their reading. This
questionnaire was designed to measure the extent to which
students exhibit the seven types of response to their “free”
or voluntary reading of books. The directions to the student
on the questionnaire read in part as follows:


QUESTIONNAIRE ON VOLUNTARY READING 


Directions to the Student 


The purpose of this questionnaire is to discover what you really
think about the reading which you do in your leisure time.
Altogether there are one hundred questions. Consider each question
carefully and answer it as honestly and as frankly as you
possibly can. There are no “right” answers as such. It is not
expected that your own thoughts or feelings or activities
relating to books should be like those of anyone else.
The numbers on your Answer Sheet correspond to the numbers
of the questions on the questionnaire. There are three ways to
mark the Answer Sheet:

A means that your answer to the question is Yes.
U means that your answer to the question is Uncertain.
D means that your answer to the question is No.

If it is at all possible, answer the questions by Yes or No. You
should mark a question Uncertain only if you are unable to
answer either Yes or No.


Please answer every question 


One hundred questions which the student is asked to answer
make up the items of the questionnaire. An illustrative set of
items, grouped under the seven types of response,⁹ follows:


“Derives satisfaction from reading”
1. Is it unusual for you, of your own accord, to spend a
whole afternoon or evening reading a book?
2. Do you ever read plays, apart from school requirements?
“Wants to read more”
1. Do you have in mind one or two books which you would
like to read sometime soon?
2. Do you wish that you had more time to devote to reading?
“Identifies himself with his reading”
1. Have you ever tried to become in some respects like a
character whom you have read about and admired?
2. Is it very unusual for you to become sad or depressed
over the fate of a character?
“Becomes curious about his reading”
1. Do you read the book review sections of magazines or
newspapers fairly regularly?
2. Do you ever read, apart from school requirements, books
or articles about English or American literature?
“Expresses himself creatively”
1. Have you ever wanted to act out a scene from a book
which you have read?
2. Has your reading of books ever stimulated you to attempt
any original writing of your own?
“Evaluates his reading”
1. Do you ordinarily read a book without giving much
thought to the quality of its style?
2. Do you ever consult published criticisms of any of the
books which you read?
“Relates his reading to life”
1. Has your attitude toward war or patriotism been changed
by books which you have read?
2. Is it very unusual for you to gain from your reading of
books a better understanding of some of the problems
which people face in their everyday living?

⁹ In the questionnaire itself, the items are ungrouped; they are,
however, readily classified by use of the scoring key.


It will be observed that this statement of the seven types of 
behavior differs somewhat from that given on pages 251 and 
252. The major purpose of this rewording was to place the 
emphasis, for several of these types of behavior, on what
students actually do rather than on what they desire to do.

The first criterion that the items included in the
questionnaire had to satisfy was that they must deal with
behaviors which were judged by teachers who prepared and used
the questionnaire to be significant evidence of the seven types
of response to reading. In a sense, then, the items constitute
a definition, in terms of what students do and say, of what
these teachers meant by “Derives satisfaction from reading,”
“Wants to read more,” etc. In order to insure that this
criterion was satisfied, the items were drawn originally from
the list of overt acts and verbal responses which the
committee judged to be significant evidences of the seven types
of response. Then, as use of the questionnaire in a number
of schools gave opportunity to secure from teachers additional
judgments of the significance of these items, the
questionnaire was revised.

In selecting and phrasing items it was necessary to consider
several additional criteria. The assumption that students
are capable of observing these overt behaviors in
themselves, of remembering, and of recording them demands
first of all that each item deal only with those behaviors
which secondary school students are apt to exhibit
and only with situations in which students are apt to find
themselves. This is almost an obvious criterion, for if we
expect the student to report on his behavior we must ask 
him questions about things he actually has an opportunity 
to do. The committee, in preparing the list of overt acts and 
verbal responses, and teachers, in judging the significance of 
items included in the early forms of the questionnaire, were 
asked to consider whether or not each of the specific acts or 
verbal responses is something which secondary school stu- 
dents are apt to do or say. It was possible later, by studying 
the responses of students to each item on the questionnaire, 
to check these judgments of teachers to some extent. Second, 
this assumption demands that each item deal with behavior 
and situations which the student is apt to remember. This 
criterion immediately rules out certain types of questions. In 
general, we would not expect students to remember, for 
example, exactly how many books they had read during the 
summer; yet we might expect them to remember whether or 
not they had read a book during the preceding week. In 
general, we would not expect them to remember the details 
of an argument with a friend about the merits of a particular 
book; yet we might expect them to remember having tried 
to defend their judgment of a book. Third, this assumption 
demands that any judgments or generalizations which the 
student is asked to formulate be relatively simple ones. An 
item which calls for an extensive introspection, for the rating 
of one’s self on an abstract and undefined quality, for mak- 
ing fine distinctions between causes or effects, etc., thus 
would be ruled out. Fourth, this assumption demands that 
each question be so phrased that it is readily understood by 
the student and can be answered with a minimum of writing. 
That the question must be understood if he is to answer it 
intelligently is obvious. That his ability to express himself in 
writing may become a factor which, for this test, may inap- 
propriately condition the evidence and the judgments made 
from the evidence, was also recognized. The selection of
Yes, No, and Uncertain as the particular pattern of
“controlled response” for the questionnaires eliminated the
necessity of the student’s writing out his answers, but made it
necessary that each question be so phrased that it could be
answered with one of the three responses provided.

The assumption that students are honest in their responses
also suggests criteria which each item must meet. Certain
activities and certain situations may have such a “prestige”
value that questions dealing with them would tempt the
student to say that he took part in them, whether he actually
did or not. Questions dealing with any activity which is
ordinarily participated in because of its “social” value thus
were ruled out, as were all questions dealing with activities in
which participation might be dependent primarily upon an
economic factor. Likewise, items which deal with activities
or situations, the disclosure of which might threaten the
student’s sense of security, may tempt him to disavow actual
participation in these activities or situations. Questions which
asked students to admit the reading of certain kinds of
materials which are commonly frowned upon, such as comic
magazines, or to disclose any of their more intimate feelings or
relationships with other persons also were ruled out. The final
criterion for the selection of the items, then, is that they deal
only with overt acts and verbal responses which the student
might be expected to report honestly.


Summarizing and Scoring the Questionnaire
on Voluntary Reading

Several forms of the Questionnaire on Voluntary Reading
were prepared during the period of the Eight-Year Study;
comparison of these several forms reveals that (1) the items
included in Form 3.32 probably best meet the criteria
outlined above, (2) the length of Form 3.32 probably is an
optimum for both practicability and reliability,* (3) the
method of summarizing Form 3.32 is statistically preferable.
For these reasons, the form of the Questionnaire on Voluntary
Reading which is recommended for use is Form 3.32.

* Statistical data on reliability are presented in the Appendix.

Form 3.32 is made up of the set of directions reprinted on
page 258 and a list of 100 questions which students are asked 
to answer with one of three responses: Yes, No, or Uncertain. 
The responses to each of these 100 items are summarized 
under six categories: (1) Likes to read, (2) Identifies him- 
self with reading, (3) Becomes curious about reading, (4) 
Expresses himself creatively, (5) Evaluates his reading, 
(6) Relates his reading to life. Originally, seven categories 
were used for summary of the scores on the questionnaire, 
but study of the students’ responses revealed that scores on 
the categories “Derives satisfaction from reading” and “Wants 
to read more” are so closely related statistically as to warrant 
their being consolidated under one heading, “Likes to read.” 
On page 259 there is presented a sample of the data sheet 
on which the scores made by individual students on Form
3.32 are reported. The scores of five students are presented
for purposes of illustration. At the bottom of the data sheet 
appear the maximum possible score for each column, and 
the highest, the lowest, and the median score for each column 
computed for the class from which these five students were 
selected. All the scores on the data sheet are expressed as per 
cents; for example, the scores in column one are per cents of 
the 35 responses which are grouped under the heading 
“Likes to read.”

Three scores are available for each of the categories: an
“Appreciation” score, a “Non-appreciation” score, and an
“Uncertain” score. For each category the “Appreciation”
score summarizes the responses which indicate that the
student engages in those behaviors which are regarded as
significant evidence of that type of behavior; the
“Non-appreciation” score summarizes the responses which
indicate that the student does not engage in those behaviors;
and the “Uncertain” score gives the proportion of items which
he was unable to answer with either Yes or No. In addition to
these scores for each of the six categories, total “Appreciation,”
total “Non-appreciation,” and total “Uncertain” scores may
be computed. These total scores summarize the responses to
all the 100 items of the questionnaire and are analogous to
the “single score” given by many tests.
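
The arithmetic behind these three scores can be illustrated with a short Python sketch. It is an illustration only, under stated assumptions: the four-item scoring key below (item numbers, categories, and the answer counted as the “Appreciation” response) is hypothetical and is not the actual key to Form 3.32, and the name category_scores is introduced here for convenience. Negatively worded items are assumed to be keyed to D (No).

```python
from collections import defaultdict

# Hypothetical scoring key: item number -> (category, answer keyed as "Appreciation").
# This is NOT the real Form 3.32 key; it only illustrates the bookkeeping.
SCORING_KEY = {
    1: ("Likes to read", "D"),          # negatively worded item, keyed to No
    2: ("Likes to read", "A"),
    3: ("Evaluates his reading", "A"),
    4: ("Evaluates his reading", "D"),  # negatively worded item, keyed to No
}

VALID_ANSWERS = {"A", "U", "D"}  # A = Yes, U = Uncertain, D = No


def category_scores(responses, key=SCORING_KEY):
    """Return, for each category, the three scores expressed as per cents.

    responses maps item number -> "A", "U", or "D".
    """
    counts = defaultdict(
        lambda: {"Appreciation": 0, "Non-appreciation": 0, "Uncertain": 0}
    )
    for item, (category, keyed_answer) in key.items():
        answer = responses[item]
        if answer not in VALID_ANSWERS:
            raise ValueError(f"item {item}: unexpected answer {answer!r}")
        if answer == "U":
            counts[category]["Uncertain"] += 1
        elif answer == keyed_answer:
            counts[category]["Appreciation"] += 1
        else:
            counts[category]["Non-appreciation"] += 1

    result = {}
    for category, tally in counts.items():
        n_items = sum(tally.values())
        # Each score is a per cent of the items grouped under the category.
        result[category] = {
            name: round(100 * count / n_items) for name, count in tally.items()
        }
    return result


# One student's (hypothetical) answer sheet.
student = {1: "D", 2: "A", 3: "U", 4: "D"}
scores = category_scores(student)
```

Because every response falls into exactly one of the three tallies, the three per cents for a category account for all of its items; total scores could be obtained in the same way by treating all items as a single category.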

An explanation of the scores made by these five students 
follows: 


Part I. Likes to Read

Columns 1, 2, 3
Column 1 gives the per cent of responses which reveal
that the student likes to read. Column 2 gives the per
cent of responses which reveal that he does not like to
read. Column 3 gives the per cent of uncertain
responses. A high score in column 1, accompanied by
low scores in columns 2 and 3, indicates that the
student likes to read to a great extent. Student A, for
example, has such a score. Low scores in columns 1
and 3, accompanied by a high score in column 2,
indicate that the student dislikes reading. Among these
five students, Student E has the highest score in column
2; however, reference to the line marked “High Score”
reveals that his score in column 2 is not the highest in
this class. A high score in column 3, such as that of
Student D, indicates that the student was somewhat
uncertain in answering the questions grouped under
this heading.


Part IIA. Identifies

Columns 5, 6, 7
These scores indicate the extent to which the student
identifies himself with his reading. Among these five
students, Students A and C have relatively high
“Appreciation” scores on this category (column 5) and
zero “Non-appreciation” scores (column 6). Such
scores indicate that the student identifies himself with
his reading to a considerable extent. Student E has the
highest “Non-appreciation” score on this category,
both among these students and among the class as a
whole. Student D has a high “Uncertain” score
(column 7).


Part IIB. Curious

Columns 9, 10, 11
These scores indicate the extent to which students are
curious about their reading. Students A and C have
high “Appreciation” scores (column 9) and low
“Non-appreciation” scores (column 10). This pattern
indicates that these students respond to their voluntary
reading by wanting to know more about authors,
books, literary periods, etc. Students D and E
probably do not respond in this fashion, for they have low
scores in column 9 and very high scores in column 10.
Column 11 gives the per cent of responses marked
“Uncertain.”


Part IIC. Expresses

Columns 13, 14, 15
These scores indicate the extent to which the student
expresses himself creatively as a response to his
reading. The highest “Appreciation” score (column 13) in
this class is 100; none of these five students has such
a high score in column 13; Students A and C are
somewhat above the median of the class (50), and Student
B is at the median. Probably none of these five
students expresses himself creatively to a very great
extent. Student E, with his high “Non-appreciation”
score (column 14), probably rarely engages in such
activities as creative writing, painting, dramatizing,
etc. Student D is characterized by a very high
“Uncertain” score (column 15).


Part IID. Evaluates

Columns 17, 18, 19
These scores indicate the extent to which the student
evaluates or judges his reading. Students B and C have
high “Appreciation” scores (column 17) and low
“Non-appreciation” scores (column 18); this pattern
indicates that they tend to evaluate their reading to a very
great extent. Student A has a low score in column 17,
as compared with the median, and his “Uncertain”
score (column 19) is rather high. This pattern differs
considerably from the pattern of his scores on the
preceding categories, and it suggests as an hypothesis
that his greatest weakness may be a failure to engage
in such activities as reading reviews and criticisms,
attempting to make judgments about what he reads,
etc.


Part II. Total

Columns 21, 22, 23
These three scores represent the totals of the scores in
the four preceding categories and are reported
primarily to provide measures whose reliabilities are
comparable to those of the scores on Parts I and III.
For the group of responses included in Part II, Student
C has a relatively high “Appreciation” score (column
21) and relatively low “Non-appreciation” (column
22) and “Uncertain” (column 23) scores. In
diagnosing the specific differences between him and Student
A, for example, it is necessary to refer to the four
preceding categories. Student D has the lowest
“Appreciation” score and the highest “Uncertain” score on Part
II; Student E has the highest “Non-appreciation” score.


Part III. Relates to Life

Columns 25, 26, 27
These scores indicate the extent to which the student
relates his reading to his life and to the problems which
he recognizes as existing. A high “Appreciation” score
(column 25), such as that of Student C, indicates that
he relates his reading to life, as he knows it, to a
considerable extent. Student E has a high
“Non-appreciation” score (column 26), in fact almost the
highest in the class. Probably he does not relate his reading
to life to any great extent. Students A and D have rather
high “Uncertain” scores (column 27).




Total Score

Columns 30, 31, 32
These scores are convenient for making a summarizing
judgment of a student’s responses to the test; however,
they necessarily obscure some of the differences among
students on various categories. The “Appreciation”
score (column 30) gives the number of the student’s
responses to the one hundred items of the test which
reveal these seven reactions to reading; the
“Non-appreciation” score (column 31) gives the number of
his responses which reveal that he does not react to
reading in these seven ways, and the “Uncertain” score
(column 32) gives the number of his uncertain
responses.


Several rather commonly occurring patterns are revealed
by the scores of these students. A set of scores which reveals
that the student responds to his reading to a considerable
extent in these seven ways is illustrated by that of Student C.
Nearly all his “Appreciation” scores are relatively high and
his “Non-appreciation” and “Uncertain” scores relatively low.
Almost the opposite pattern is revealed by the scores of
Student E: relatively low “Appreciation” scores and relatively
high “Non-appreciation” scores. The relatively high
“Uncertain” scores of Student D reveal that, despite the
instructions to answer the questions with Yes or No if it were at
all possible, he answered a large number of the questions with
Uncertain. Several hypotheses might be advanced to account
for this: He may have been quite indifferent to the test and
have marked almost at random; he may have been extremely
“overcautious” or scrupulous in attempting to answer the
questions; he may have been unable to answer many of these
questions because he had failed previously to observe such
behaviors in himself. Further study of other data about this
student would be necessary to confirm or deny these
hypotheses and to arrive at a satisfactory interpretation of such
a pattern of scores. The scores of Student A indicate a
student who likes to read very much yet does not evaluate his
reading to any great extent. His relatively high “Uncertain”
scores on Part IID and Part III should be used as a starting
point for hypotheses as to why he responded in this fashion
only to these two categories.


Other Instruments 

Two questionnaires, similar in structure to the Question- 
naire on Voluntary Reading, were developed for the purpose 
of measuring students’ responses to a particular novel or a 
particular drama which they have read. The Novel Question- 
naire (Test 3.22) includes 65 items, the responses to which 
are summarized under the same six categories as are the re- 
sponses to Form 3.32. Similar scores are computed for each 
of the six categories, and for the total of 65 items. The Drama 
Questionnaire (Test 3.21) includes 80 questions, the re- 
sponses to which are summarized under the six headings 
mentioned above plus an additional heading: “Feels that he 
understands the play.” This category was added to the Drama 
Questionnaire in order to aid in the interpretation of scores 
on the six categories. It was believed that the extent to which 
a student feels that he understands the play he has read may 
demand differing interpretations of his other responses. For 
example, a pattern of scores which indicates that a student 
derived no satisfaction from reading the play yet felt that he 
understood it perfectly probably would demand a different 
interpretation from one which indicates that the student
derived no satisfaction from reading the play and felt that he
did not understand it. A similar category has not been added 
to the Novel Questionnaire; it is possible that teachers using 
the Novel Questionnaire would find such an addition helpful. 

Each of the three questionnaires described includes, as
has been indicated, a set of items the responses to which are
summarized under the heading, “Evaluates his reading.” The
purpose of this category is to discover to what extent
students actually engage in such activities as comparing the
merits of one book with those of another, discovering what
critics have said about books they have read, comparing
their judgments of books with those made by others, etc.
Scores on this category obviously do not furnish information
about the quality of the judgments which the student makes
of books, just as scores on the category “Likes to read” do
not furnish information about the quality of the books which
he actually reads. Because a number of teachers wished to
have some objective means of appraising the quality of
students’ judgments, this evaluation problem was explored.
Three experimental instruments were developed; these are:
An Interpretation of Literature (Test 3.1), Critical-Mindedness
in the Reading of Fiction (Test 3.7), Judging the
Effectiveness of Written Composition (Test 3.8). Because
these instruments have not been used extensively or studied
sufficiently, they are not as yet to be recommended for
widespread use. However, they might serve as useful classroom
exercises and they might suggest techniques for appraising
students’ judgments which others would want to utilize.

These three tests use short stories as their content or
subject-matter. In brief, they were constructed by asking
a group of students to write out any judgments of the
story which they could or would care to make. After these
judgments had been sorted and the duplicating ones
discarded they were submitted to a jury of teachers. The jury
grouped them and marked each as a “good” or a “poor”
judgment. The test was then made up, including the story and
the list of students’ judgments, and those who took the test
were directed to read the story and respond to each of the
statements listed by agreeing with it, disagreeing with it, or
stating that they could neither agree nor disagree. The
evaluation of each judgment made by the jury is used as a test
key. Scores are given in terms of the extent to which the
student evaluated these judgments as did the jury. It should
be pointed out that this is only one method of scoring
responses on such a test. Other methods might be devised
which would better suit the purposes of particular schools or
teachers.

Test 3.1, An Interpretation of Literature, is based on 
O. Henry’s story “A Municipal Report.” The student is asked, 
after reading the story, to respond to statements which are 
grouped under such headings as: 


What is your interpretation of the story?
What was O. Henry’s point of view?
What was O. Henry’s philosophy?
What was the character’s motive?
Which is the most logical ending for the story?


Scores for each of these parts may be computed. 

Test 3.7, Critical-Mindedness in the Reading of Fiction, 
makes use of two short-short stories reprinted from a popu- 
lar magazine. The statements which follow each of these 
stories deal with the extent to which the actions and speech 
of these characters, the description given by the authors, the 
outcomes of the stories, etc., are “true to life.” For example, 
these statements follow the story “First Acquaintance” by
I. A. R. Wylie:


1. The general atmosphere (the smells, the signs on the
door, the moving nurses, etc.) is depicted accurately in
this story.

2. It seems scarcely likely that a young man would wonder
about the “No visitors” sign, the oxygen tank, and the sick
mother and daughter as the youth in this story did.

3. Under the circumstances it seems natural for the youth to
say “Gosh” and “That’s tough” several times.

4. No nurse, even a young one, would volunteer as much
information about patients to a stranger as the nurse in
this story does.

5. The youth’s sudden realization of what death means and
his thoughts about his own mother seem real and natural.

6. The suggestion that the youth was crying when he left
the hospital is difficult to believe.

7. The emphasis upon the fact that the mother and daughter
were alone in the world seems exaggerated and overdone.

8. Under the circumstances it seems natural for the young
man, on his return to the hospital the next morning, to be
more concerned to find out about the condition of the
sick girl’s mother than of that of his sister.

9. The action of the young man in going into the girl’s room
to tell her that she had not been left completely alone is in
accordance with what the reader has previously found out
about his character.

10. The sick girl’s response to his sympathy does not seem
true to life.


Four scores are given on this test: (1) “Judicious,” i.e., the 
extent to which the student's responses agree with the jury's 
judgment; (2) “Hypercritical,” i.e., the extent to which the
student judges situations which the jury believes are true to
life to be not true to life; (3) “Uncritical,” i.e., the extent to 
which the student judges situations which the jury believes 
are not true to life to be true to life; (4) “Uncertain,” i.e., 
the extent to which the student was unable to agree or dis- 
agree with these statements. 
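
The relation among these four scores can be made concrete with a brief Python sketch. It is offered only as an illustration under assumptions not stated in the text: each statement is taken to have been reduced beforehand to a judgment of the situation itself (“true” for true to life, “not_true” for not true to life), the scores are expressed as per cents of the statements, and the function name and sample data are hypothetical.

```python
def critical_mindedness_scores(jury, student):
    """Tally the four Test 3.7 scores as per cents of the statements.

    jury:    list of "true" / "not_true" (the jury's view of each situation)
    student: list of "true" / "not_true" / "uncertain" (the student's view)
    """
    if not jury or len(jury) != len(student):
        raise ValueError("jury and student must be non-empty and of equal length")

    tally = {"Judicious": 0, "Hypercritical": 0, "Uncritical": 0, "Uncertain": 0}
    for jury_view, student_view in zip(jury, student):
        if jury_view not in ("true", "not_true"):
            raise ValueError(f"unexpected jury judgment: {jury_view!r}")
        if student_view not in ("true", "not_true", "uncertain"):
            raise ValueError(f"unexpected student judgment: {student_view!r}")

        if student_view == "uncertain":
            tally["Uncertain"] += 1
        elif student_view == jury_view:
            tally["Judicious"] += 1        # student agrees with the jury
        elif jury_view == "true":
            tally["Hypercritical"] += 1    # jury: true to life; student: not
        else:
            tally["Uncritical"] += 1       # jury: not true to life; student: true

    n_statements = len(jury)
    return {name: 100 * count / n_statements for name, count in tally.items()}


# Hypothetical jury key and one student's judgments for five statements.
jury_key = ["true", "true", "not_true", "not_true", "true"]
student_views = ["true", "not_true", "true", "not_true", "uncertain"]
result = critical_mindedness_scores(jury_key, student_views)
```

Since each statement is counted under exactly one heading, the four per cents always sum to 100; a student with a high “Judicious” score necessarily has low scores on the other three.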

Test 3.8, Judging the Effectiveness of Written Composition,
makes use of a short-short story written by a high
school student. This story is followed by 28 statements about
the narrative quality, the style, the characterization, etc., of
this story. For example, these statements are included:


1. The writer should not have included so many different
episodes in one brief story.

3. The writer shows considerable skill in depicting the
humorous aspects of situations.

4. The dialog in the story is, in general, handled ably.

5. Esmond’s stammering, hesitant way of speaking in trying
situations helps the reader to see him as an individualized
character.

6. The concluding episode provides a very effective climax
for the story.

7. Esmond is a good name for the chief character in the
story.


This test is also scored by comparing the student's responses 
with those provided by a jury of adults. 


Validity of the Questionnaires 

In order to assess the value of the instruments designed to 
measure students’ responses to their reading it will be neces- 
sary to consider their validity, their reliability, and the uses 
which classroom teachers may make of them. It was pointed 
out earlier that the validity of the questionnaire technique 
for measuring students’ responses to their reading is pri- 
marily dependent upon the extent to which three major as- 
sumptions are fulfilled; it was also pointed out that whether 
or not these assumptions are fulfilled will depend upon both 
the nature of the instrument and the conditions under which 
it is administered. The construction of one of the question- 
naires has been described in some detail in order to illustrate 
how certain criteria which were demanded by these three 
assumptions were applied. If these criteria are judged to be 
adequate and the items of the questionnaire meet the
criteria, then the instrument is one which is so constructed as
to make possible the collection of valid evidence of the seven
types of response to reading.

Valid evidence of these types of response, however, may 
not be given by the questionnaire even though its construc- 
tion is judged to be satisfactory. Obviously, if such an instru- 
ment as Form 3.32 were administered as a “final examina- 
tion” and the students informed that their grades or credits 
would be determined by their scores, we would not expect 
it to yield valid evidence of those students’ responses to their
voluntary reading. The conditions which should attend the
administration of one of these questionnaires are as follows: 
First, the teacher should understand the kinds of evidence 
the questionnaire is designed to give and should desire to 
secure this evidence. Second, the teacher should have a cur- 
riculum program which might be expected to bring about 
the development of the seven types of response. Third, the 
teacher should have developed a rapport with the students 
which will enable and encourage them to respond honestly 
to the questions. Fourth, the students should understand and 
accept the purpose of the administration of the questionnaire 
and the uses which are to be made of the results. This is 
merely to say that an evaluation instrument must be under- 
stood, must be relevant to the objectives and the curriculum, 
and must be accepted by the students as an opportunity to 
appraise themselves, if its use is to be of greatest value. 

The assumption that students will respond honestly is a
crucial one in these questionnaires, and unless it is fulfilled
we cannot hope for valid evidence. In the construction of the
questionnaire an attempt was made to select items which
would not tempt students to be dishonest in their responses,
and the directions were so phrased as to emphasize the
desirability of answering as frankly and as honestly as possible.
These were efforts to aid in securing honest responses.
However, these efforts cannot be expected to make certain that
the assumption will be fulfilled. The degree of rapport
between teacher and students, students’ previous experiences
with “tests” and with the uses of test results, and students’
concepts of the purposes of education and of the place of
evaluation in education may determine to what extent the
responses will be honest ones.

The questionnaire technique which is used in these instruments
differs from the method of direct observation of students
by a teacher only in that the student is both subject
and observer rather than being merely the subject. One
method, then, of checking the honesty of a student’s re-
sponses to the questionnaire would be to compare his re- 
sponses with observations made by one or more adults of 
what he actually does and says. It should be possible for one 
familiar with the overt acts and verbal responses included in 
the questionnaire to compare his observations of some of 
these behaviors with the student’s responses. For example, a
teacher might provide periods for “free-reading” and during 
those periods determine to what extent the student welcomes 
interruptions of his reading, reads various types of fiction 
and nonfiction, reads attentively, etc. Also, in conversation 
with a student, a teacher could secure evidence which would 
help her judge to what extent certain wishes and feelings 
expressed in his responses to the questionnaire were genuine. 
This is one method of validating responses to the question- 
naire. 

A somewhat different method which might be used would 
be to interview a student about his reading behaviors and in 
addition to asking him what he does, ask him for illustrations 
or examples of these behaviors. For example, a teacher who 
wished to know whether or not a student reads book reviews 
in current publications rather regularly probably could dis- 
cover this without attempting to observe such reading di- 
rectly. By asking him whether or not he ever read book 
reviews and, if his reply were yes, following this by asking 
in what publications he read them and what reviews he had 
read recently, and by giving him an opportunity to discuss 
some of these reviews, she could be reasonably certain of 
whether or not he actually did such reading. Such a pro- 
cedure, of course, need not be an inquisition nor need it 
result in only an answer to the teacher’s question. Reading 
guidance might be given as well as reading behaviors appraised
in the same conversation.

Recognition of this method as a means of achieving rea- 
sonable certainty about what students actually do and say 
leads to the possibility of constructing a paper and pencil
instrument which would achieve a similar result. The stu- 
dent might be asked to respond on paper to questions about 
his reading behavior and then write out an illustration or an 
example of each behavior. The nature of the illustration or 
example presumably would be evidence which would tend 
to substantiate or refute his contention that he engaged in 
such behaviors. Let us for convenience call this a “direct
form” of the questionnaire. The first page of such a direct 
form is reprinted below. 


Name ____________________   Age ______   Sex ______

Grade ______

This is not a “test” but an attempt to discover more about your
reading interests. Obviously, no two persons have exactly the
same reading interests; consequently there are no “right” or
“wrong” answers, as such, to these questions.

Please answer each question as carefully and as honestly as you
can. Mark your answer to each question by checking the space
under Yes, No, or Uncertain at the right of the sheet. If your
answer to a question is Yes, please give the additional information
asked for in the question. If your answer is No or Uncertain, go
on to the next question.

                                               Yes   No   Uncertain
1. Do you have in mind one or two books
   which you would like to read? ...........   ___   ___   ___
   If you do, please give the author and
   title of one:

2. Do you ever read adventure novels in
   your spare time? ........................   ___   ___   ___
   If you do, please give the author and
   title of one which you have read:

3. Do you ever read essays, apart from
   school requirements? ....................   ___   ___   ___
   If you do, please give the author and
   title of one which you have read:

4. Is there any author whom you like so
   well that you would like to read any new
   book he might write? ....................   ___   ___   ___
   If there is, please give his name and the
   title of one of his books which you have
   read:

5. Do you ever of your own accord read
   humorous stories or books of satire? ....   ___   ___   ___
   If you do, please give the author and
   title of one which you have read:

6. Do you ever read biography, apart from
   school requirements? ....................   ___   ___   ___
   If you do, please give the author and
   title of one which you have read:


Such "direct forms" of the questionnaire have been used 
in studying the functioning of the Questionnaire on Volun- 
tary Reading. The methods and the results of these studies 
will be reported in full in a forthcoming monograph. In brief, 
we find, for some classes, a relatively high relationship be- 
tween responses on the Questionnaire on Voluntary Reading 
and on a direct form. These relationships, expressed as 
product-moment correlation coefficients, range from .38 to
.79.⁸ Other types of studies which make use of interview
techniques and of comparison of teachers’ ratings of students
with test scores will also be reported in the monograph.
Similar studies of students’ responses to the Novel and Drama
Questionnaires have not been made; the presumption would
be, since the basic technique is similar to that of Form 3.32,
that such studies would yield results much like these. Tests
3.1, 3.7, and 3.8 were described as experimental instruments
and the fact that they have not been studied has been mentioned.

8 Fourteen such coefficients derived from a study of Form 3.32 are
distributed as follows: .35 to .40, one; .45 to .50, one; .60 to .65,
two; .65 to .70, three; .70 to .75, three; .75 to .80, four. The
median of this distribution is .695.
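
As an illustration only, the two statistics mentioned here can be
sketched in a few lines of Python. The function names and the sample
scores are hypothetical and do not come from the study. The class
limits given to grouped_median rest on an assumption: each stated
interval (for example, ".65 to .70") is read as having a true lower
limit half a unit below its stated one (.645). That reading was chosen
because it reproduces the median of .695 reported in footnote 8; it
may not be the convention the staff actually used.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores in each list must vary")
    return sxy / sqrt(sxx * syy)


def grouped_median(classes):
    """Median of a grouped distribution by interpolation within a class.

    classes: list of (true_lower_limit, width, frequency), in ascending order.
    """
    total = sum(freq for _, _, freq in classes)
    if total == 0:
        raise ValueError("empty distribution")
    half = total / 2
    below = 0  # number of cases below the current class
    for lower, width, freq in classes:
        if freq > 0 and below + freq >= half:
            return lower + (half - below) / freq * width
        below += freq


# Distribution from footnote 8, with ASSUMED true lower limits (.345, .445, ...).
FOOTNOTE_8_CLASSES = [
    (0.345, 0.05, 1),
    (0.445, 0.05, 1),
    (0.595, 0.05, 2),
    (0.645, 0.05, 3),
    (0.695, 0.05, 3),
    (0.745, 0.05, 4),
]

# Hypothetical scores for five students on the questionnaire and a direct form.
questionnaire_scores = [12, 15, 9, 20, 17]
direct_form_scores = [10, 14, 8, 18, 19]

if __name__ == "__main__":
    print(round(grouped_median(FOOTNOTE_8_CLASSES), 3))
    print(round(pearson_r(questionnaire_scores, direct_form_scores), 2))
```

Under the assumed limits, seven of the fourteen coefficients fall below
.695 and seven above it, which is why the interpolated median lands
exactly on that boundary.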


Uses of the Instruments 

Two major uses of the instruments described in this section 
may be pointed out: (1) To provide information about 
students which will aid in planning the school program and 
in guiding students; (2) To provide evidence on which can 
be based an appraisal of the progress of students and of the 
effectiveness of the school program. Before instruments such 
as the questionnaires described here are used, however, it is 
important for the teacher to examine the instruments care- 
fully and to satisfy herself that they deal with behaviors 
which she regards as important. When such instruments are 
used, it is also important to recognize the limitations inherent
in them and to supplement the evidence given by
them with evidence gained from classroom observation and
from other instruments. In interpreting scores on these instruments,
it is important to consider the reliability data
which are furnished in the Appendix and to use caution in
making judgments based on differences in scores, either between
individuals or groups.
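
One conventional way of exercising this caution, offered here only as
a sketch and not as the procedure used by the staff, is to compare a
difference between two scores with the standard error of that
difference. The formulas are the usual ones from test theory. The
function names, the sample standard deviation and reliability
coefficient, and the multiplier of 1.96 are illustrative assumptions;
real values would have to come from the reliability data in the
Appendix.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), where r is the reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def difference_exceeds_chance(score_a, score_b, sd, reliability, z=1.96):
    """True if two scores differ by more than z standard errors.

    Assumes both scores come from the same instrument, with the same
    standard deviation and reliability, so that the standard error of
    the difference is SEM * sqrt(2).
    """
    se_diff = standard_error_of_measurement(sd, reliability) * sqrt(2.0)
    return abs(score_a - score_b) > z * se_diff


# Hypothetical values: SD of 10 score points and a reliability of .84.
if __name__ == "__main__":
    print(difference_exceeds_chance(50, 55, sd=10, reliability=0.84))
    print(difference_exceeds_chance(50, 65, sd=10, reliability=0.84))
```

With these assumed figures, a gap of 5 points would be treated as
within chance fluctuation and a gap of 15 points would not. The less
reliable the instrument, the larger the gap must be before it is read
as a real difference.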

The kinds of information given by these instruments have
been described above. Such information as that given by the
Questionnaire on Voluntary Reading should be of use to a
teacher early in the school year to aid her in becoming acquainted
with some of the reading behaviors of her students.
For example, a teacher might profitably make use of the information
that certain students or certain groups of students
make very low "Appreciation" scores on the category “Likes
to read.” Assuming that a favorable attitude toward the
reading of books is of some importance, either as an end in
itself or as a means to other ends, the teacher might plan
special classroom experiences which would help these stu- 
dents to overcome the unfavorable attitude and to develop 
a favorable attitude toward books. In planning these experi- 
ences the question of why these students do not seem to like 
to read would necessarily be raised. In order to answer this 
question a number of hypotheses would have to be explored. 
Here the teacher would want to make use of evidence from 
other tests, such as tests of reading comprehension, from 
classroom observations made by other teachers, and from the 
school and home records of these students. 

Such exploration of hypotheses might lead the teacher to 
give special attention to the reading behaviors of certain stu- 
dents as well as of the class as a whole. In planning reading 
experiences for individual students she also might find scores 
on the questionnaire helpful. For example, discovery of a 
student with a high “Appreciation” score on the category 
“Likes to read” but with relatively low “Appreciation” scores 
on the other categories might prompt the teacher to help the 
student discover and participate in such reactions as evaluat- 
ing reading or relating it to life. Teachers have found that 
a conference early in the year with individual students which 
begins with the consideration of test scores may lead to an 
enthusiastic planning of individual programs of reading and 
other activities by the students themselves. In such confer- 
ences, of course, test scores should not be regarded as “marks” 
or judgments but instead as evidence which should be con- 
sidered in planning the work of the year. 

The second use is that of providing evidence on which 
appraisals may be based. Evidence of change from year to 
year in the status of individual students in their reactions to 
voluntary reading should be given by such an instrument as 
the Questionnaire on Voluntary Reading. This evidence 
should be useful to the student who wishes to make an appraisal
of his achievement, to parents who wish to appraise
the progress of their children toward goals such as develop-
ing a favorable attitude toward voluntary reading, and to 
teachers who wish to appraise the success of their guidance 
and instruction in aiding students to cultivate some of these 
responses to reading. The appraisal of their own achievement 
by students is probably a necessary concomitant in any plan 
of promoting student as well as teacher planning of the edu- 
cational program. Such appraisal, in turn, should stimulate 
further planning by both teacher and student. When the in- 
terest of parents in the success of their children demands 
more than a summarizing mark, a description of change in 
status as revealed by test scores should provide useful evidence
to supplement anecdotal records or comments of the
teacher. It is important, of course, for those who interpret
these scores to others to make sure that changes in test scores
are not mere chance fluctuations, but are "significant" differences,
before interpreting them as such.

The role of other instruments in aiding the teacher in planning
or in appraising her program should not be overlooked.
Let us recall the three questions which members of the
Committee on the Evaluation of Reading wished to be able
to answer; namely, (1) How well does the student read?
(2) What does the student read? and (3) How does the
student react to his reading? An answer to the first question
may be needed to help explain why a student does not read
of his own accord, or does not like to read. An answer to the
second question may be needed to help explain why a student
does not relate his reading to life. Thus in establishing
hypotheses about the causes of certain students’ difficulties
in responding to reading it may be necessary to make use of
several instruments which were designed to measure somewhat
different behaviors. On the basis of such hypotheses,
educational programs which are relevant to the particular
needs of the student or group of students may be planned.
In appraising the program it may be desirable to make use
of several instruments again in order to determine to what
extent each of these behaviors has been modified. Conse- 
quently the use of such an instrument as the Questionnaire 
on Voluntary Reading may not be a sufficient evaluation 
procedure in itself. Those who wish to develop a more com- 
prehensive plan of evaluation of reading behaviors should 
find the description of the instruments designed to help de- 
termine how a student reads and what he reads pertinent to
their needs. These descriptions appear on pages 319 to 337.


THE EVALUATION OF THE APPRECIATION OF ART


The Committee on Evaluation in the Arts, composed of 
art teachers in the schools of the Eight-Year Study, listed as 
purposes of art teaching the following: (1) objectives per- 
taining to the development of sensitivity to art values, com- 
monly called appreciation; (2) objectives related to the 
development of the ability to express certain types of experi- 
ences creatively; and (3) objectives related to emotional 
adjustment resulting from the release afforded by creative 
experience. 

The evaluation of the first of these objectives—the devel- 
opment of sensitivity to art values—is the one with which 
the staff has been primarily concerned. Emotional adjust- 
ment can be fostered by means of well directed creative 
experience in the arts but the question of which are the 
particular types of emotional problems that can be solved, 
as well as the question of which kinds of creative experience
offer a remedy for a particular emotional problem, is as yet
not definitely answered.⁹ So it was felt that the primary consideration
was the evaluation of sensitivity to art values and,
although some attention was devoted to the emotional connotations,
the results are not as yet sufficiently established to
warrant extensive discussion. Furthermore, the area of personal
and social adjustment was being explored separately
(cf. Chapter VI); consequently only casual remarks on this
aspect of the objective will be made in the following pages.

9 The more important literature concerning this problem is cited in Levey,
Harry, "A Theory Concerning Free Creation in the Inventive Arts,"
Psychiatry, III (May, 1940), p. 229 ff.

The problem of evaluating sensitivity to art values was
further narrowed to include only the field of the visual arts. 
Here again it seemed unnecessary to duplicate work done in 
other areas. The evaluation of the appreciation of literature 
is discussed in the preceding section; other instruments of 
evaluation of appreciation in the field of the arts will be dis- 
cussed on page 307. Thus the task became one of developing 
evaluation instruments which would appraise the students’ 
sensitivity to art values in the field of the visual arts.


Ways of Getting Evidence and Exploration of 
Possible Criteria for a New Instrument 

The first step in the study of the problem was to survey 
currently used methods of getting evidence regarding art 
experiences and art appreciation of students. Some of the 
methods which have been used to discover the development 
of the subject’s knowledge regarding art—his intellectual 
understanding of art—include art questionnaires, art vocab- 
ulary tests, and similar instruments. These tests have at- 
tempted to appraise primarily the extent to which the student 
is familiar with art history and art techniques. Other tests 
have attempted to obtain an appraisal of the extent to which 
the student is able to apply certain rules of color-combination, 
balance, etc., in dealing with art objects. The success of the 
student on all of these tests seems to be chiefly dependent
upon the extent to which he has mastered a body of factual 
knowledge which may be helpful in bringing about an 
esthetic experience. 

Another approach to evaluation in the arts is through tests 
which attempt to measure the extent of the subject's interest 
in art and to discover in which sub-fields he has a special 
interest. Still another method of gathering evidence regard-
ing art experience has been to rely on a student’s opinion 
about these experiences. His opinions may be stated in essay 
form or they may be expressed as responses to a checklist. 
More informal methods frequently employed by teachers in- 
clude anecdotal records about student behavior, collections, 
descriptions, or photographs of creative work, and checklists 
filled out by teachers. The advantages and disadvantages of 
all these methods were reviewed in an attempt to set up 
criteria for an instrument designed to appraise responses to 
art values. 

First of all, it was thought that tests of intellectual under- 
standing, of mastery of specific areas of information, while 
useful where information is a part of the objective, would not 
necessarily contribute to an appraisal of the art sensitivity of 
the subject. It was recognized that a student may be sensitive 
to art values even though he has not mastered a body of 
specific information or rules. The converse seems also to be 
true; that is, a student may be familiar with the meaning of 
technical terms, the facts of art history, and so on, without 
being responsive to artistic values. It seemed desirable, 
therefore, that an instrument of appraisal should be so con- 
structed that it would depend as little as possible upon the 
student’s previously amassed information regarding art. The 
fact that it would be extremely difficult to eliminate this 
element entirely was also recognized. 

Even though written statements about art experiences
have the advantage of being highly personal and, therefore, 
may give insight into the nature of the individual's reaction, 
they too have one important disadvantage—they are fre- 
quently unfair to the student who is relatively lacking in the 
ability to state his reaction in words. It should be recognized
that not all students who are capable of genuine and deep 
art experience have correspondingly well developed verbal 
abilities. It is very likely, for instance, that some students 
who have very little verbal facility find a means of expression
in art.¹⁰ Finally, there seem to be certain immediately visible
qualities in an art object which are extremely difficult to
translate into words, even for the verbally gifted person.
Painting and prose are seldom mutually interchangeable as a 
means of expression. For these reasons it was thought desir- 
able to have the instrument depend as little as possible upon 
verbal expression of subjective reactions. Since it was recog- 
nized that it would not be possible to eliminate the verbal 
element entirely, the aim was to reduce it to a minimum. 

Records of behavior, anecdotal records, and collections of 
creative work, whereas they have the advantage of yielding 
evidence about the personal art experience of the individual, 
also have disadvantages. For instance, they do not provide a 
uniform basis for comparisons between students; also they 
apply only to the students who are productive in the studio; 
they fail if a student does not attend art classes. 

In summary it might be said that there seemed to be a 
need for a new instrument which, as far as possible, would 
be constructed in such a way as to satisfy the following cri- 
teria: (1) that the results should not depend primarily upon 
a body of factual knowledge; (2) that the results should not 
depend upon the ability to express art experience verbally; 
(3) that the responses should permit a comparison of differ- 
ent students on a uniform basis; and (4) that the instrument 
should permit the evaluation of the responses both of stu- 
dents who are known to be artistically creative and of those 
who have not as yet exhibited such talents. 

10 Moreover, it seems as if adolescents especially are reluctant to state
their problems openly and verbally. To them the less obvious way of
expression by means of creation and participation in the arts is one of the
main ways of dealing with these problems.

It was thought further that the instrument should attempt
to get at the person’s reaction to a work of art as a unit or as
a whole, rather than at reactions to specific, separate elements
of an object of art. It is doubtful whether one can get
a valid indication of the capacity for esthetic experience
evoked by an art object and what this object conveys, by 
asking a person to react separately to line, spatial arrange- 
ment, or color. Although this seems true for the evaluation 
of the esthetic experience as a whole, for the evaluation of 
certain aspects of esthetic capability a person’s response to 
certain specifics of an art object is also needed. This is par- 
ticularly true if the teacher wants to know at what particular 
stage of development the student’s reactions to certain known 
features of art may be. Two additional criteria, then, seemed 
necessary. First, the instrument should allow the student to 
react to the art object in an esthetic way and permit a re- 
sponse to the work of art as a whole; that is, to have as com- 
plete an art experience as possible. Second, the instrument 
should contain a variety of elements and evoke specific re- 
sponses so that the examination of these reactions of the 
student would permit an evaluation of his esthetic develop- 
ment with reference to these known elements. 


Some Remarks on the Psychology of Art Appreciation 


Before discussing in detail the specific assumptions under- 
lying the development of the instrument, some further re- 
marks concerning "art appreciation" should be made. Un- 
fortunately the connotations of this term vary in different 
contexts and no definition is generally accepted. Sometimes 
the term is used in a rather narrow sense, covering only a 
passive act on the part of the beholder who in this context 
is compared with a piece of wax that bears the impression of 
a seal. A recent theory recognizes a great deal more activity 
on the part of the beholder who is supposed in the act of 
"empathy" to neglect his own personality and to live in the 
world of the work of art for the span of time during which he 
is in “empathy.” A still more recent theory is that offered by 
"Gestalt" psychology. In dealing with these problems, from 
the point of view of this psychology, art appreciation is con- 
sidered as a field phenomenon,¹¹ the field consisting of the
beholder and the work of art. The act of art experience can
take place—the field can be established—only if the spec- 
tator is willing to undergo the art experience. This willing- 
ness is a deliberate act on the part of the spectator, and art 
appreciation becomes an active rather than a passive reac- 
tion. In this connection it may be mentioned that for other 
and more elaborate reasons John Dewey¹² suggests that the
term "art appreciation" may be discarded for the term "art 
experience," and the latter term implies activity on the part 
of the beholder. 

If art experience is conceived of as a field phenomenon, 
then the field will be strongly conditioned by the difference 
in the degree to which any one of the main elements constituting
the field governs it. One extreme would be a situa-
tion in which the work of art dominates the field, a situation 
close to the one mentioned above in the example of the seal 
on wax. Fortunately this situation never occurs because even 
the most passive spectator is still a personality with a par- 
ticular background, particular education, particular opinions 
and feelings about art, which, even though he may be un- 
aware of them, will influence the field. The other extreme 
would be a situation in which the spectator dominates the 
field and is not touched at all by the work of art. It might be 
said that he is in a situation in which he is confronted with 
a work of art which he sees but does not experience. The 
ideal situation is a playing back and forth within the realm 
of the field, the spectator becoming more and more incited 
to bring new facets of his personality into play, and in turn
becoming more aware of new facets of the work of art. Spectator
and work of art may be said to be communicating with
one another, a communication which is strongly conditioned
by the nature of both of them. The importance of the personality
of the spectator, his experience, and his emotional
predispositions, may be corroborated by the well-known fact
that at different times in life we experience works of art in
different ways.

11 See Koffka, "Psychology of Art," Bryn Mawr Symposium on Art, p. 224 ff.

12 See, for example, John Dewey’s recent volume, Art as Experience.

It thus becomes apparent that when one learns which 
aspects of a work of art are important for the art experience 
of an individual, access has been gained not only to his par- 
ticular way of experiencing art, but also to his personality. It 
is even more important to ascertain which works of art in- 
duce a spectator to have this personal experience—to learn 
which works of art incite him to establish this field phenom- 
enon called art experience. Moreover it is of interest to find 
out which works of art “leave him cold,” because they are to 
him void of meaning, or because they seem too unimportant 
to him to induce the amount of interest necessary for ex- 
periencing them. Again this will shed some light not only on 
the character of the spectator’s art experience but also on his 
personality. If something about the personality of a student
can be learned by studying the environment which he cre- 
ates for himself, by exploring the kinds of persons he prefers 
to be with, or the kinds of persons that he avoids, then the 
type of pictures with which a person does or does not “com- 
municate” may be indicative not only of his art experience, 
but also of his personality. Finally, one wants to learn 
whether or not a person actually prefers the works of art with 
which he is able to communicate. 

The possible bearings of art experience on creativity in the 
field of art deserve comment. Obviously only the person who 
is able to experience in an esthetic way objects and events 
of the outer world, art objects as well as others, is able to 
express these esthetic experiences creatively. It was assumed 
that artists perhaps more than others are capable of having 
esthetic experiences with objects not yet molded into esthetic 
wholes. Moreover, during the process of expression or creation 
the emerging product has to be evaluated by the artist in
terms of his esthetic perception, in terms of the evolving 
product's suitability to induce or evoke art experiences in an 
ideal beholder.¹³ Therefore it is to be expected that the art
experience of the artist would not be essentially different, 
but only more highly and more intricately developed when 
compared with the art experience of the non-artist. It might 
also be expected that persons whose art experience is highly 
developed need not be or become artists, either because of 
lack of skills or because of other reasons. On the other hand, 
one would expect the artist’s art experience to be of the high- 
est quality. Moreover, the person who demonstrates a high 
degree of esthetic sensitivity in relation to the extent of his 
art experience may be a latent or future artist. 

Although the above remarks are not adequate for covering 
the topic with which they deal, it seemed desirable to clarify 
to a certain extent the theoretical framework underlying the 
assumptions on which the development of the new instru- 
ment was based. These assumptions will now be discussed. 


DEVELOPMENT OF THE INSTRUMENT 


Basic Assumptions 

The basic assumption of the new instrument to be de- 
scribed in the following pages is that it is possible to under- 
stand the nature of and degree to which the art experience 
of an individual is developed by ascertaining the degree to 
which he is able to see and appreciate significant similarities 
and differences in art objects. “The reaction of the artist is 
colored by all sorts of . . . associations and feeling, of which 
he is naturally unaware, but which affect profoundly the 
form taken by the work of art and which have the power to 
stir up corresponding . . . feelings in the spectator. It is the
fact that the works of art act as a transmitting medium between
the artist's . . . nature and our own that gives it its
peculiar, and as we may say ‘magic’ power over us. It is
magic because the effect on our feelings often transcends
what we can explain by our conscious experience.”¹⁴ If the
reactions of the artist, colored and conditioned by his personal
associations and feelings and embodied in his work of
art, have stirred the spectator to corresponding—even though
not necessarily identical—feelings, the artist (or actually the
work of art) and the spectator may be said to be communicating
with one another. This communication is possible if
the spectator has been able to establish an esthetic field including
himself and the art object. When this happens, we
may say that he really is able to “appreciate” the work of art,
that he is “sensitive” to its artistic qualities.

13 It is not implied that the artist tries to "please" the general public,
but that his efforts are concentrated on organizing his creation in such a
way that it may be suitable for conveying his esthetic message.

The deeper the art experience of the subject is, the more 
he responds to the personality of the artist as revealed in the 
work of art, the specific way in which the artist rendered his 
subject matter, the cultural background of the work of art, 
the importance of the media chosen, the particular way they 
are used, etc. The quality of his art experience is developed 
to an even higher degree if he is responsive in this way to 
different works by the same artist, though the subject mat- 
ters and other more superficial qualities (such as the size of 
a picture) may differ from one work to the next. 

A first assumption, then, may be that art sensitivity is re- 
vealed by the degree to which a student responds to the 
visible similarities existing in certain works of art created by 
the same artist. As a matter of fact, the degree to which these 
similarities can be seen and the degree to which a subject 
can reasonably be expected to respond to the affinity existing 
between the objects created by one artist will depend on 
many factors. Some of these factors, such as the particular
selection of works of art being viewed and the context or
conditions under which these works are seen, are of outstanding
importance. Unless these are properly controlled,
the assumption may become invalid.

14 Fry, Roger, Art History as an Academic Study, p. 13 in his “Last
Lectures.”

A second assumption is that the nature of a student’s art 
experience may be revealed by the kinds of similarities to 
which he is or is not responsive. He may be responsive to the 
similarities existing between works of art seen as wholes, to 
the affinity mentioned before, or he may be responsive only, 
or chiefly, to similarities in color, mood, or spatial arrange- 
ment. If enough opportunities are given to a student to select 
similarities, his pattern of reaction may be open to examina- 
tion. This may also be said to be true in a negative sense; 
that is, it may be characteristic of a student not to see, or to 
be unresponsive to certain kinds of similarities. 

A third assumption is that a student whose appreciation is 
well developed will have a certain definite emotional reac- 
tion to art objects. He will like works of art which make use 
of the qualities he is responsive to; he will dislike art objects 
which make use of qualities that do not appeal to him. He 
will neither like nor dislike art objects which “leave him 
cold,” which “do not convey any meaning,” i.e., art objects 
which seem uninteresting either way. 


Construction of the Instrument 

The construction of the instrument, “Finding Pairs of Pic-
tures,” was based largely upon the three assumptions dis-
cussed above. The instrument had to provide evidence as to
the degree to which, and the way in which, students respond 
to the affinities existing between works of art; it had to pro- 
vide evidence concerning the kinds of similarities to which 
they are responsive or unresponsive; and it had to reveal the 
art objects, or qualities of art objects, to which they have a 
definite emotional reaction. 

According to the first and second assumptions, it is possible
to understand the nature and degree of art experience of an
individual by ascertaining the degree to which he is able to 
see and appreciate important similarities and differences in 
art objects. It was thought that this might be tested most 
appropriately by presenting students with examples of art 
objects and asking them to pair them, and then examining 
the results to see what inferences might be drawn. The third 
assumption—that students will have an emotional reaction to 
art objects which make use of qualities to which they are 
responsive—could be tested by asking the students to select 
certain examples which they liked or disliked for certain 
reasons, and examining these choices to see whether or not 
they corroborated hypotheses raised by the examination of 
the pairings. 

In constructing the instrument it was impossible to present 
a great variety of art objects at one time, and hence for prac-
tical reasons a restriction to one field of the visual arts was
necessary. A decision was made to begin with the construc- 
tion of a test covering the field of painting. This field was 
selected for two reasons: (1) it is more complex than some 
of the minor arts, and (2) students are usually more familiar 
with it than with sculpture, architecture, or with the minor 
arts. There is also the possibility that the response to certain 
subtle values in paintings may be a valid indication of 
esthetic response to the same values when they appear in 
other fields of the visual arts. For instance, one would expect 
a person whose response to color combinations in paintings 
is well developed to be able to apply the same discrimination 
in dealing with textiles, etc. This will have to be tested in 
future studies, however.15

The next problem after limiting the field to that of paint-
ing was that of setting up criteria for the selection of the
paintings to be used for the pairing.

15 It is realized that ideally an evaluation of art experiences should cover
all the fields of the visual arts, and it is thought that tests based on similar
principles but covering other areas, such as sculpture, architecture, and the
minor arts, can and should be developed.

In the first place, the pictures had to be selected in such a 
way that there would be an optimum chance for creating an 
esthetic mood. It was desirable that everything endangering 
this mood should be avoided as far as possible. It was neces-
sary to exclude pictures evoking too strong effects and pic-
tures evoking extra-esthetic deliberations, unless very special
reasons recommended using them. Thus, it was decided that
certain subject-matter fields could not be used because they 
dominated the students’ interest too strongly. For instance, a 
picture such as “Washington Crossing the Delaware” could 
not be used because primarily it evokes patriotic feelings or 
historical deliberations, rather than “purely esthetic” feel- 
ings. Because it was found in preliminary studies that some 
students have difficulty in pairing pictures from widely dif- 
ferent subject-matter fields, it was thought desirable to limit 
the subject-matter somewhat in order to provide a maximum 
opportunity for pairing.

It was also felt that students brought up in the tradition 
of appreciation for the old masters and students whose main 
interest is concentrated on modern art should, in taking the 
test, have about the same opportunities to reveal sensitivity 
to art values. Therefore, it was necessary to exercise care in 
order that the selection not be dominated by one group or 
the other.

Most important of all, however, was the selection of ex- 
amples which could be legitimately paired; that is, examples 
containing affinities which can be recognized by students. It 
was realized that the similarities between the paintings of a 
single artist may not always be greater than the similarity of 
certain elements of one of his paintings to the same elements 
in a painting by another artist. Care had to be exercised to
remove as many of these potential sources of confusion as 
possible. To assure this point, it was decided that the selec-
tion of paintings to be paired should be made on a strictly
empirical basis. In line with this, a series of experiments was 
made with a group of 60 high school students who were 
chiefly in the ninth, tenth, and eleventh grades. Several hun- 
dred reproductions of paintings were presented to them in 
groups of about 40 paintings, and a careful record of their 
responses was kept. Pictures which were not used at all by 
these students for pairing were discarded at once. Pictures 
which were paired with pictures by other artists in more than 
25 per cent of the total number of times they were used were 
also excluded from further experiments. The remainder of 
the pictures which had been paired with those by another 
artist were dealt with in a manner which can be best de- 
scribed by giving an example of what actually happened. 

One group of paintings presented to the students of the 
experimental group contained among other pictures two 
paintings by Picasso, “The Absinth-drinker” and “The Gui- 
tarist”; several paintings by El Greco, among them the “View 
of Toledo”; and several paintings by Corot, among them 
“Paysage.” 

In more than 25 per cent of the times any one of the two 
paintings by Picasso was used for the purpose of pairing, it 
was paired with the other painting by Picasso. Therefore, 
the experiments with these two Picassos were continued. 
Suppose one student paired “The Absinth-drinker” by Picasso 
with the “View of Toledo,” while another student paired the 
same picture with the “Paysage” by Corot. This suggested 
that in a complicated situation, when many elements from 
which to choose are offered, it is difficult for some students 
to respond to the affinity existing between these two Picassos. 
Therefore, a less complicated experimental situation was set 
up. Four pictures were presented to the students (the two
Picassos, the Corot, and the El Greco), and they were asked
to find the picture closest to “The Absinth-drinker.” In other
words, this time they did not have to select one out of 39
pictures, but one out of three. Unless at least 90 per cent of
the students of the group selected the other painting by
Picasso as the best choice, the painting “The Absinth-drinker”
would have been excluded from future experiments.

This procedure was followed with all paintings for which 
some doubt existed about whether or not they ought to be 
excluded from the test. The purpose of this procedure was to 
make sure that the pre-supposed affinity existing between 
paintings by the same artist actually exists for students of this 
age level and cultural background. By selecting the sample 
in this way it was hoped that as far as possible no standards 
would be imposed on the students which might be outside of 
their particular experience or alien to the orbit of their 
esthetic perception. 
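
The two-stage screening described above can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the tabulation actually used in the study: the record format (pairs of picture identifiers), the `artist_of` mapping, and the function names `screen_pictures` and `passes_follow_up` are hypothetical. The 25 per cent and 90 per cent thresholds are the ones given in the text.

```python
from collections import Counter


def screen_pictures(pairings, artist_of, max_cross_pct=25):
    """First stage: sort pictures into 'discard', 'follow_up', or 'retain'.

    pairings  -- list of (picture_a, picture_b) tuples recorded from students
    artist_of -- dict mapping every presented picture to its artist
    """
    used = Counter()    # how often each picture appears in any pair
    cross = Counter()   # how often it was paired with another artist's picture
    for a, b in pairings:
        used[a] += 1
        used[b] += 1
        if artist_of[a] != artist_of[b]:
            cross[a] += 1
            cross[b] += 1

    verdict = {}
    for pic in artist_of:
        if used[pic] == 0:
            verdict[pic] = "discard"      # never used for pairing
        elif cross[pic] * 100 > max_cross_pct * used[pic]:
            verdict[pic] = "discard"      # crossed artists in more than 25%
        elif cross[pic] > 0:
            verdict[pic] = "follow_up"    # some doubt: retest in a small set
        else:
            verdict[pic] = "retain"
    return verdict


def passes_follow_up(choices, same_artist_picture, required_pct=90):
    """Second stage: did at least 90% choose the same artist's other picture?"""
    if not choices:
        raise ValueError("no student choices recorded")
    hits = sum(1 for c in choices if c == same_artist_picture)
    return hits * 100 >= required_pct * len(choices)
```

Integer arithmetic is used for the percentage comparisons so that a share of exactly 25 or exactly 90 per cent is classified without rounding surprises.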

In selecting the material for the instrument, then, the sam- 
ples were restricted to the field of painting; pictures were 
selected in such a way as to provide an optimum chance for 
creating an esthetic mood; pictures which might prove too 
distracting were avoided; examples were restricted to a few 
subject-matter fields; care was taken to provide examples of 
the works of old and modern masters; and as far as possible 
only those pictures by any one artist were chosen which, ac- 
cording to preliminary experiments, had similarities which 
students are able to recognize as such. 


DESCRIPTION OF THE TEST 


As finally developed, the instrument consists of a picture 
sheet, a set of instructions to the student, and an answer 
sheet. 


The Picture Sheet 

The picture sheet consists of a piece of cardboard, approxi- 
mately 24” x 44” in size, on which 40 colored postcards are 
mounted. These are copies of more or less well-known paint-
ings ranging in periods represented from the Italian and
German Renaissance to modern and contemporary art.
Dutch, Spanish XVIIth century, and French XIXth century 
paintings are included. Portraits, landscapes, and still-lifes 
are represented. The copies used are of the best available 
quality, chiefly Jaffe prints, and they have been arranged on 
the cardboard in such a way that the whole set makes in 
general a pleasant appeal. Particular effort has been made 
to avoid having one painting interfere with the appreciation 
of another next to it. No titles or names of artists are given,
but each painting is marked with a number for identifica-
tion.16


The Instructions 


The instructions presented to the students are so stated as 
to reassure them that the test is not based on any particular 
notions about art or painting, periods or painters. They are 
told that it is not expected that the art appreciation of an 
individual ought to conform to any fixed standards. Efforts 
are made to convince them that art appreciation is some- 
thing very personal, different from one person to the next. 
Therefore it is carefully pointed out that there are no “right” 
or “wrong” ways of going about taking the test. 

Deliberate efforts are made to avoid as far as possible re- 
strictions which might limit the response, or create an at- 
mosphere of examination. Thus students are told that no time 
limit is set, and, even though according to experience the 
student's ability to find pairs is usually exhausted after about 
45 minutes, it is recommended that teachers allow students 
to use as much time as they wish in taking the test. 

Other limitations of the response would be to ask the stu- 
dents to use every picture, or to find a prescribed number of 
pairs. In order to avoid this type of restriction it is pointed 
out to the students that they are not required to use every 
one of the pictures, that they may use one picture several
times for the purpose of pairing, another one not at all. A
certain freedom is given to the students in determining the
number of responses they wish to make. During the prelim-
inary experiments the students were not told to select any
particular number of pairs; nevertheless the great majority
selected between 20 and 30 pairs. Experience to date has
shown that nearly all students are able to find about 20 pairs
and that after about 23 pairs most of the students stop work-
ing. On the basis of this experience, the students are asked to
find, if possible, at least 20 pairs, but not more than 30 pairs.

16 For the list of paintings used see the Appendix.

The instructions suggest the selection of pairs of pictures 
which have important artistic features in common. As exam-
ples of such features, style of painting, use of colors, design,
mood, and the way in which objects are painted are mentioned.
Since experience demonstrated that most students show a 
tendency to rely too strongly in their pairing on the similarity 
of subject matter, they are warned that: "If a subject matter 
in two pictures is the same (such as flowers), but if each of 
them is painted in a different way, then this similarity of 
subject matter does not seem to be an important reason for
pairing them. It might be better to put one of these paintings 
of flowers together with a portrait, or a landscape in which 
the colors and the design, the style and the mood are very 
much like those used in the painting of flowers." 


The Answer Sheet

The students are asked to indicate their choices of pairs 
and their preferences and dislikes of pictures on an answer 
sheet prepared for this purpose. The answer sheet consists
of two parts and contains in its first part, in addition to the 
usual identifying data, spaces in which the students can indi- 
cate their selections of pairs by writing the numbers of the 
two paintings which according to their opinion have impor-
tant artistic features in common. This part is arranged as
follows:

1. No. ____ and No. ____ make a pair.
2. No. ____ and No. ____ make a pair,
and so on up to 30.


The second part of the answer sheet is arranged as fol- 
lows: 


Now that you have studied all of the pictures, give some
general information as to your personal preferences and
dislikes.

1. Please give the numbers of 1, 2, or 3 pictures which you
like best:
The numbers of these pictures are ____
I like these pictures best because ____
2. The picture I like best for the mood is picture number ____
3. The picture I like best for the colors is picture number ____
4. Please give the numbers of 1, 2, or 3 pictures which you
like least:
The numbers of these pictures are ____
I like these pictures least because ____
5. The picture I like least for the mood is picture number ____
6. The picture I like least for the colors is picture number ____


THE TEST INTERPRETATION

The Scoring


The basis for the scoring is the number of pairs of pictures 
painted by the same artist which a student is able to find. 
Pairs of pictures painted by the same artist will, for con- 
venience, be called “S” pairs. The pictures used permit the 
selection of as many as 43 different “S” pairs. 

One of the “S” pairs consists, for instance, of the pictures 
No. 1 and No. 24 (see list of paintings in Appendix). Both 
are paintings by Picasso, painted in his so-called “blue”
period. The color scheme used in both paintings is very sim-
ilar, and no other painting is included in the test which has
an analogous color scheme. Both paintings are representative
of a certain period of painting. The particular flow of lines, 
the sad mood expressed in them, the way in which the sub- 
ject is rendered, and many other features can be found only 
in these two paintings in the present set. If a student selects 
this pair we may assume that he is responsive to several of 
the artistic similarities these two paintings have in common, 
and, moreover, that he probably is responsive to the affinity 
existing between the two pictures as a whole. 

A copy of a score sheet is reproduced in the Appendix. 
Three scores are given in per cents—the “S” pairs a student 
was able to find; the ratio of the number of “S” pairs to the 
number of attempts; and the number of artists, expressed as a 
per cent of the total number of artists, whose paintings the 
student was able to pair in an “S” way. 
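
The three scores can be illustrated with a short Python sketch. It rests on several assumptions that the text does not settle and that are therefore hypothetical: the first score is taken as a per cent of the 43 possible “S” pairs, a repeated pair counts as an attempt but only once as an “S” pair, and the total number of artists is supplied by the caller. The function name and the `artist_of` mapping are likewise illustrative only.

```python
def score_answer_sheet(pairs, artist_of, total_artists, possible_s_pairs=43):
    """Return the three scores, each expressed in per cent.

    pairs         -- list of (picture_a, picture_b) tuples from the answer sheet
    artist_of     -- dict mapping picture number to artist
    total_artists -- number of artists used as the base of the third score
    """
    attempts = len(pairs)
    # An "S" pair joins two different pictures by the same artist;
    # frozenset makes (1, 24) and (24, 1) the same pair.
    s_pairs = {
        frozenset((a, b))
        for a, b in pairs
        if a != b and artist_of[a] == artist_of[b]
    }
    artists_paired = {artist_of[next(iter(pair))] for pair in s_pairs}
    return {
        "s_pairs_pct": 100.0 * len(s_pairs) / possible_s_pairs,
        "ratio_pct": 100.0 * len(s_pairs) / attempts if attempts else 0.0,
        "artists_pct": 100.0 * len(artists_paired) / total_artists,
    }
```

A student who writes down four pairs, two of which are distinct “S” pairs, would receive a “Ratio” of 50 per cent under these assumptions, whatever the total number of pairs attempted by other students.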

Of these three elements the most important and most in- 
formative is the second. The first score obviously is condi- 
tioned by the willingness of a student to select many pairs; 
by pure chance a student who selects 30 pairs ought to find 
more “S” pairs than one who selects only 20 pairs. Therefore, 
the per cent of “S” pairs has to be interpreted in the light of 
the number of attempts the student made; this is facilitated 
by the second score. The score on number of “S” pairs is 
recorded because if two students, for instance, have about 
the same score in “Ratio,” the one with the higher score in 
“S” pairs obviously has given a better performance. 

The score on “Number of artists” is mainly of descriptive 
character and may be used for the purpose of ranking stu-
dents only if the score on “Ratio” as well as on “S” pairs is
nearly the same for two students. Actually this score is sep-
arated into subscores and the record on the right side of the
score sheet indicates those artists whose paintings a student
was able to pair in an “S” way.

If a student paired only or primarily old masters in an “S” 
way, one may infer that this is the realm of his main interests.
If a student found a great many “S” pairs by using only
paintings by one or two masters it might be that he is for 
one reason or another very well acquainted with just these 
paintings, and it may be inferred that the range of his under- 
standing is smaller than is indicated by the score on “S” pairs 
and “Ratio.” The statements regarding preferences and dis- 
likes are not recorded on the score sheet because so far no 
way of treating them numerically has been found. 

The scores indicate to what degree the student's apprecia- 
tion of the 40 paintings included in the test is developed as 
compared with other members of his group. They indicate 
roughly whether his appreciation of modern or old masters, 
of portraits or still-lifes, is developed to about the same de- 
gree, or is unevenly developed in any one of these areas. By 
means of the scores alone it is not possible to ascertain 
whether a student has native artistic ability or only an intel- 
lectual understanding of the field. A high score may be due 
to native ability or it may be due to the special background 
of the student. Familiarity with art, frequent visits to mu- 
seums, and the like, influence the score in the same way as 
creative work in the arts or native abilities influence it.
Nevertheless, the rough score seems to indicate fairly ac- 
curately where a student stands within his group with respect 
to the degree to which his art experience is developed. If one 
wishes to know more about a student, his individual re- 
sponses must be examined, since the answer sheet furnishes 
information which is not reported on the score sheet. The 
method of obtaining this is to make an interpretation of the 
data recorded on the answer sheet. 

The main assumption underlying this interpretation is: 
everything that the subject does is important and he does not 
do anything without valid reasons. The basis for a given re- 
action of a student may or may not be a genuine esthetic 
response to an art experience; nevertheless, in interpreting 
the results of the test, one ought to be able to answer certain
questions. For example: What were the main artistic features
to which the student responded? What are the artistic fea- 
tures to which he seems to be unresponsive? What might 
have been the reasons preventing him from making an 
esthetic response to the art objects presented to him? Or, 
approaching it in another way, one might ask what might be 
the reasons within a student's personality which made him 
respond to certain works of art, or particular art elements, 
and not to others. To answer these questions, the study of 
pairs consisting of two pictures painted by different artists is 
as important as the study of the so-called “S” pairs. The 
former pairs may be called “D” pairs. 

A “D” pair which is occasionally selected by some students
consists of No. 1 and No. 35. Both paintings make use of 
greenish colors, but their use, the way they are blended, and 
their meaning within the context of the painting is quite dif- 
ferent in these two paintings. The mood expressed in both 
paintings is of a more or less introspective quality, enforced 
by the cold colors in which both are painted. The quality of 
this introspectivity is different, however. The mood of No. 1 
may be described as being sad and withdrawn, whereas the 
mood of No. 35 is one of religious exaltedness. The style in 
which these pictures are painted is different, but there may 
still be recognized in both a common “Spanish” element. 
The selection of this “D” pair may be accepted as indicating 
that the subject who selected it was responsive to the general 
color used in these paintings, even though he was not respon-
sive to the different ways in which these greenish colors are
blended. He probably was responsive to the general mood of
introspectivity permeating both pictures, without being re-
sponsive to the important difference in mood which can be
recognized. The student may have been responsive to the
“Spanish” element common to Nos. 1 and 35 without being
responsive to the difference in the style.

Many more inferences pertinent to the student’s art experi-
ence and in this way pertinent to his response to art as well
as to his personality might be drawn from the fact that he 
selected this particular pair. Great caution has to be exercised 
not to consider valid an inference based on the study of any 
one pair. The selection of any particular pair can have a 
quite different meaning when occurring in different con- 
texts. Any one response to this instrument has to be inter- 
preted in the light of the possible meaning of all other evi- 
dence which can be obtained through a study of all of the 
responses of the subject to the test. In this connection, as has 
been mentioned before, not only what a student does is of 
importance, but also what he avoided, or missed doing, has 
significance. Pairs he selected not only have to be studied in 
the context of all other pairs, but they have to be studied in 
their sequence, and in the light of the pairs the student failed 
to select. First we have to consider which are the pictures he 
likes and dislikes; these data in turn will shed light on the 
pairs selected because students tend, in their pairings, to 
make different uses of preferred and of disliked pictures. 

When the present study of this instrument is concluded, 
all pairs which have been used to a considerable extent and 
which seem to be significant either for the art experience or 
the personality of a student, will be listed, each with the in- 
ferences which suggest themselves in connection with the 
use or non-use of the pair. Once this list is available the 
interpreter will have to integrate into a consistent picture 
the meaning of the pairs which a student has selected plus 
the meaning of the non-use by this student of pairs com- 
monly used. This integration will have to be achieved 
through considerations of the meaning of the preference or
the dislike of any one of the 40 pictures. 

This task will be less difficult than it appears, because we
can restrict the investigation of the student's responses to the
areas in which he differs from the group. The “Ratio” score
which a student obtains places him in a certain section of his
group. The importance for the interpretation of his selection
of any one pair depends upon the extent to which it is similar 
in difficulty to the other pairs which he has used. An example 
might clarify this somewhat. If, for instance, a student who 
is in the lowest quarter of his class in his "Ratio" score selects 
an "S" pair which has been found by only one or two other 
students who are among those receiving the highest scores 
on "Ratio," this pair becomes very significant for the inter- 
pretation. It becomes significant because one would expect a 
student with a low “Ratio” score to be able to find only the 
most obvious pairs, that is, only the pairs which have also 
been selected by a large portion of the group. The opposite 
is also true—if a student is in the highest quarter of the class 
in his score on “Ratio,” and one finds that there are pairs 
selected by a large portion of the group which he has missed 
or avoided using, these pairs become significant for the inter- 
pretation. 
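
The comparison of a student's pairs with those of his group can be sketched as follows. This is a hypothetical illustration rather than a procedure given in the text: the quartile coding (1 for the lowest quarter on “Ratio,” 4 for the highest), the cut-off values `rare_max` and `common_min`, and all function names are assumptions, and the sketch does not distinguish “S” pairs from other pairs.

```python
from collections import Counter


def as_pair(a, b):
    """Store a pair with its picture numbers in ascending order."""
    return tuple(sorted((a, b)))


def pair_shares(group_sheets):
    """Fraction of group members who selected each pair at least once."""
    counts = Counter()
    for sheet in group_sheets:
        counts.update({as_pair(a, b) for a, b in sheet})
    return {pair: n / len(group_sheets) for pair, n in counts.items()}


def notable_pairs(student_sheet, group_sheets, quartile,
                  rare_max=0.10, common_min=0.50):
    """List the pairs that deserve attention for this student.

    quartile -- 1 if the student is in the lowest quarter on "Ratio",
                4 if in the highest quarter; other values return no notes.
    """
    shares = pair_shares(group_sheets)
    chosen = {as_pair(a, b) for a, b in student_sheet}
    notes = []
    if quartile == 1:
        # A low scorer would be expected to find only the obvious pairs.
        for pair in sorted(chosen):
            if shares.get(pair, 0.0) <= rare_max:
                notes.append((pair, "rarely chosen pair found by a low scorer"))
    elif quartile == 4:
        # A high scorer would be expected to find the pairs most others find.
        for pair in sorted(shares):
            if shares[pair] >= common_min and pair not in chosen:
                notes.append((pair, "commonly chosen pair missed by a high scorer"))
    return notes
```

The pairs returned are only candidates for interpretation; as the text stresses, no inference should rest on a single pair.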

It is evident that a student's responses must always be 
examined against the background of the group and the way 
in which the members of the group have reacted to the test 
problems. This is not only true of the particular group in 
which the student is working but it is also true of large age,
sex, and cultural groups. The study of these larger group
differences will provide important material for future inves- 
tigations. 

As a basis for the test interpretation, the following informa-
tion is therefore needed:

1. An analysis of how often any pair has been used by
the other members of the group. This analysis will
make possible a decision as to the degree of signifi-
cance which might be attached to the selection of a
pair. The kind of inferences which can be drawn if
a pair has been selected has been indicated in the
preceding pages. Here we may add some of the inferences
which might be drawn if a student does not select a
particular pair. The “D” pair mentioned above con-
sisting of Nos. 1 and 35 may again be used as an
example. Assuming that this “D” pair has been com-
monly used by the other members of the group, some
of the reasons for the avoidance of this pair might
then be: a better developed discrimination for the
importance of color shades, for differences in style,
and a lesser degree of responsiveness to introspec-
tivity.


2. Knowledge of the average number of times any single
one of the pictures has been used. Continuing the
example, we should like to know whether the student
used pictures expressing an introspective mood less
often than the average. If that is true, then the avoid-
ance of Pair 1-35 might not be due to a higher dis-
crimination, but may be due to a lack of interest in
paintings expressing an introspective mood. In this
connection it may be added that the use of a pic-
ture more often than the group average usually indi-
cates that the student’s interest centers around this
picture. This interest need not always be of a positive
nature. A repetition of a pair may also indicate a con-
centration of interest.


3. A comparison of the preferences or dislikes of a stu-
dent with the preferences and dislikes of his fellow
students. In continuation of the example mentioned
above, we may say that if this student states that he
prefers introspective pictures, or pictures making use
of dark, greenish, or cold colors, we can be reason-
ably sure that he avoided the selection of Pair 1-35
for esthetic reasons. On the other hand, if he dis-
likes this type of painting, or is not at all interested
in it, the avoidance of Pair 1-35 becomes less impor-
tant as far as the evaluation of his discrimination for
artistic values is concerned. As another indication of
his avoidance of introspective tendencies it will still
be important for evaluating his personality.

4. A study of the sequence of pairs. Study of the mean-
ing of the sequence of pairings has been very fruit-
ful. To illustrate the kinds of insights this permits,
the following example may be given. Certain very
obvious pairs tend to appear in
the very beginning of the test. A student who begins 
with seldom used pairs seems to be one whose art 
experience is different from that of others in the 
group. To begin by indicating pairs consisting of 
portraits is usual. To begin with a pair consisting of 
still-lifes suggests either a person very much inter- 
ested in this subject matter or a student who is re- 
served at first in establishing positive relations with 
his fellow men, or both. 
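
The second item of the list above, the use of each picture relative to the group average, lends itself to a simple tabulation. The following Python sketch is illustrative only; the data layout (one list of picture-number pairs per answer sheet) and the function name are assumptions, not part of the instrument.

```python
from collections import Counter


def usage_relative_to_group(student_pairs, group_sheets):
    """For each picture, the student's use count minus the group's mean count.

    student_pairs -- list of (picture_a, picture_b) tuples for one student
    group_sheets  -- list of such lists, one per other member of the group
    Positive values mean the student used the picture more often than average.
    """
    student_use = Counter(pic for pair in student_pairs for pic in pair)
    group_use = Counter(
        pic for sheet in group_sheets for pair in sheet for pic in pair
    )
    n = len(group_sheets)
    pictures = set(student_use) | set(group_use)
    # Counter returns 0 for pictures a party never used.
    return {pic: student_use[pic] - group_use[pic] / n for pic in pictures}
```

A large positive value would point to a picture around which the student's interest centers, which, as noted above, need not be an interest of a positive nature.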


It can be seen that the interpretations would be greatly fa-
cilitated if they could be made on the basis of a fairly large
collection of data on the way in which members of different 
groups respond—the ways in which they pair the pictures; 
the pictures they like and dislike; and the sequences in 
which pictures are used. Thus far it has not been found pos-
sible to achieve this.


HOW TO ADMINISTER THE TEST


In accordance with our general conception of art experi-
ence, it is important that a spirit of freedom prevail during
the time the test is taken in order that an esthetic mood may
be created and maintained. It is best to have every student
work with a separate picture sheet. However, two or three
students may work together on one picture sheet. Although
care should be taken that they do not unduly influence one
another, explicit prohibitions against discussing the
test should be avoided. Some free discussion makes the se-
lection of pairs much more interesting. Anything that can
contribute to the students’ feeling at ease should be done.
Thus, they should be allowed to stand up and move around
so that they may see the pictures better, etc.


RELIABILITY AND VALIDITY 
Reliability 

On a priori grounds it seems reasonable to believe that it 
is more difficult to secure an adequate sample of pictorial 
art (a field in which reactions may be strongly influenced by 
the emotions) than it is to achieve an adequate sampling of 
information within a restricted subject-matter area. Because 
sampling affects reliability, a reliability coefficient which 
would not be considered very high for an information test 
may be the highest reliability coefficient which can be ex- 
pected on an art test of the type described. 

Meier and Seashore, for example, state that “with tests 
based upon concrete learning accomplishment a higher reli- 
ability is expected than one testing complex mental functions. 
With the latter kind, a coefficient of reliability of .80 is 
regarded as about as high as can reasonably be expected, 
because of the uncertainty of knowing exactly what factors 
operate in the person’s total reaction. With a test of capacity 
the opportunity for chance factors to control the final result
is increased, hence a somewhat greater allowance must be
made for them.”17

17 Art Judgment Test, Examiner’s Manual, p. 21.

Two reliability studies of the instrument under discussion
here were made, the first based on the split-half method, the
second on a comparable test form. The reliability coefficients
estimated by correlating the halves of the test and applying
the Spearman-Brown prophecy formula, based on the test
results of 145 twelfth-grade high school girls and boys, are
as follows: for the scores on “S” pairs, the coefficient is 0.57;
for “Number of artists,” it is 0.58. Since the ratio score for
each half cannot be added to get the “Ratio” score for the 
entire test, we cannot give a statistical estimation of the 
reliability of the ratio score based on the split-half method. 
Therefore, a somewhat comparable form consisting of 49 
other paintings in place of the original 40 paintings was 
developed. These 49 paintings are not as well known as the 
ones used in the original form, but they cover the same 
periods. The students who took both tests were not 
homogeneous in their art experience. The group consisted of 
27 senior high school students and 38 college students. It 
may be expected that the results will be somewhat better if 
the experiment is repeated with a larger group. The second 
form was taken shortly after the first test was taken, either
after a lapse of several hours, or within one or two days
following. The reliability coefficients based on the
intercorrelations of the two forms are 0.58 for “S” pairs,
0.77 for “Ratio,” and 0.54 for “Number of Artists.”
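
The split-half procedure mentioned above can be illustrated with a short sketch. It assumes the standard form of the Spearman-Brown prophecy formula; the half-test correlations are back-calculated from the reported coefficients for illustration only and do not appear in the study itself. The function names are hypothetical.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test `length_factor` times as long
    as the part whose correlation is `r_part` (2.0 for split halves)."""
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


def half_correlation_from_full(r_full: float) -> float:
    """Invert the split-half case: the half-test correlation that the
    formula would step up to `r_full`."""
    return r_full / (2.0 - r_full)


# Reported stepped-up coefficients; the half-test correlations printed
# here are implied values, not figures taken from the source.
for label, r_full in [("S pairs", 0.57), ("Number of Artists", 0.58)]:
    r_half = half_correlation_from_full(r_full)
    print(f"{label}: implied half-test r = {r_half:.2f}, "
          f"stepped up = {spearman_brown(r_half):.2f}")
```

The sketch also suggests why the method could not be applied to the “Ratio” score: the formula presupposes that the two half scores add up to the full score, which a ratio does not do.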

As has been mentioned before, the most important score 
is the one on “Ratio.” According to the directions of the test, 
which give great liberty to the students in selecting many 
or few pairs, and in using the paintings of many or few 
artists, it was not to be expected that the reliability coefficient 
of the scores on “S” pairs and on “Number of Artists” would 
be very high. 

[18] For a list of these paintings see the Appendix. Some of the pictures
in the comparable form furnished such interesting and important
information that they ought to be included in a future form of the test
in place of some of the pictures originally used which were less
successful in yielding information.

Validity

Validity studies are still in progress. Such evidence of
validity as has been collected will be presented here, with
the reservations which must accompany data which are
incomplete. It was thought that the validity of this test might
be established if the following assumptions can be
substantiated by the evidence:


1. The test measures some ability which is not a function
of the particular pictures used in the test. The
correlation between two tests which use quite different
pictures would seem to indicate that the test is
measuring an ability which is not dependent upon
the particular pictures which make up the test, but an
ability which does operate within a wide variety of
pictures.

2. Subjects are responsive to one or more of the basic
qualities of art, but are responsive in different degrees.
This would seem to be supported by the fact
that the lowest score on number of “S” pairs which
any student made is higher than a chance score would
be, and there is a considerable range, from 17 per
cent to 100 per cent, in the scores of the subjects.[19]

[19] See table on p. 304. By mere chance a student might be expected to
get a score of 5 per cent or less.

3. The development of visual sensitivity or of art abilities
need not correspond to the development of intellectual
abilities as measured by the usual intelligence
tests. In the case of one school, the results of this test
were compared with the results of intelligence tests,
giving a correlation coefficient of approximately zero.

4. Since art is something which can be taught, at least
up to a certain degree, the general level of a group
of art students ought to be higher than the level of
a comparable group without art training. The median
score of a group of art students has been found to be
higher than the median score of any unselected group.
The groups are small, however, and no controlled
experiments have been set up to indicate whether or
not a further selective factor of ability or interest has
been operating to produce the results which we have
at present. It would be desirable to repeat this study
with larger groups of students and to compare the
results with the results of control groups. (See table on
page 304.)

5. Students with native ability should give a better per- 
formance on the test than students without native 
ability. Within a group without art training, there- 
fore, there should be some students who, due to their 
native ability, perform as well as students with art 
training. This would seem to be corroborated by the 
fact that in the eighth grade the highest score made 
by any student is as high as the lowest score made 
by any student in the master class in painting in an 
art academy. This latter group is composed of stu- 
dents who intend to become professional artists. As 
can be seen in the table on the next page there is 
considerable overlapping in the ranges of scores of 
different groups. The weight which each factor, abil- 
ity and training, contributes to the scores will have 
to be determined by a controlled experiment. 

6. The student reveals the nature of his appreciation of 
art and some elements of his personality structure by 
his choices of pairs and by his preferences for pic- 
tures. Evidence for this assumption is encouraging 
though not conclusive. Unfortunately, many of the
evaluations of the interpretations have been made in
verbal rather than numerical form. It is impossible
at this point to print them in full, or to ascribe
numerical values to these evaluations. In four
schools, however, teachers were asked to select a
number of students with whom they were very
familiar. The test results of these students were
interpreted and the teachers rated these interpretations
on a five-point scale ranging from “very good” to
“poor.” The intermediate ratings were classified as
“generally accurate,” “possibly accurate, but
insignificant,” and “of doubtful value.”


In one school the teachers selected 17 students. Of the
interpretations of the test results of these students, nine
received the highest rating, “very good.” Four descriptions
received the next rating, “generally accurate,” and four the
middle rating, “possibly accurate, but insignificant.” No
cases were placed in the “of doubtful value” or “poor”
columns. The teachers were also asked to indicate any “gross
inconsistencies or errors.” They found none, and stated
further that in no instance was there failure to designate at
least one important characteristic of the student.[21]

In another school the descriptions were not only rated on 
the five-point scale. For some descriptions the teachers used 
ratings composed of two of the five points of the scale, in- 
dicating in this way that one part of the description seemed 
to deserve one rating, another part of it another rating. 
Thirty-three students were described; of these descriptions
16 were rated as “very good,” four as partly “very good” and
partly “generally accurate.” Two were rated as “generally
accurate,” one as partly “very good,” partly “possibly accurate,
but insignificant.” Three were rated as partly “very good”
and partly “poor,” two as “of doubtful value,” none as “poor.”
Five descriptions were rated with different combinations of
the five values of the rating scale.[22]

[21] This study was conducted at George School, Bucks County,
Pennsylvania.
[22] This study was conducted at the Cambridge School, Cambridge,
Massachusetts.

The test results of 27 students of an art academy were
interpreted and the faculty of the department of painting was
asked to rate these interpretations on the same five-point
scale. Eighteen cases received the highest rating, six the next,
none the middle rating, one was rated of doubtful value, and
two received the lowest rating.[23]

Finally, 31 students of a teacher-training institution were
tested and the results interpreted. The descriptions of these
students were rated by the art faculty, and 22 descriptions
were rated as “very good,” six as “generally accurate,” two
as “possibly accurate, but insignificant,” one as “of doubtful
value,” and none received the lowest rating. It was added that
“almost without exception the essential qualities of the
students” were “clearly” mentioned in the descriptions.[24]

The validity studies conducted at these four institutions 
are summarized in the table on page 584. According to this 
table, approximately 60 per cent of the descriptions were 
rated “very good,” and approximately 81 per cent of the
descriptions were considered as being satisfactory (either 
very good, or generally accurate, or of an intermediate value 
between these two). Approximately 10 per cent of the de- 
scriptions were considered as being unsatisfactory (either 
of doubtful value, or poor, or intermediate values between 
these two). Only 2 per cent of the descriptions were rated 
as being definitely of poor quality. It is hoped that this dis- 
cussion will indicate the direction of the work on validity, 
both past and future, and the extent to which the evidence, 
however meager, supports the original assumptions. 
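
The percentages quoted above can be partly re-derived from the counts reported for the four institutions. The following is a minimal sketch, not the authors' tabulation: the dictionary labels are hypothetical, and the five descriptions at the second school whose mixed ratings are not itemized in the text are kept in a separate bucket, so the figure for unsatisfactory descriptions cannot be reproduced from the text alone.

```python
from collections import Counter

# Rating counts transcribed from the four studies described above.
# Labels joined with "+" are split ratings; "other mixed" holds the five
# descriptions whose combinations the text does not itemize.
studies = {
    "first school": {
        "very good": 9, "generally accurate": 4, "possibly accurate": 4,
    },
    "second school": {
        "very good": 16,
        "very good+generally accurate": 4,
        "generally accurate": 2,
        "very good+possibly accurate": 1,
        "very good+poor": 3,
        "doubtful": 2,
        "other mixed": 5,
    },
    "art academy": {
        "very good": 18, "generally accurate": 6, "doubtful": 1, "poor": 2,
    },
    "teacher-training institution": {
        "very good": 22, "generally accurate": 6,
        "possibly accurate": 2, "doubtful": 1,
    },
}

totals = Counter()
for counts in studies.values():
    totals.update(counts)  # adds the counts label by label
n_descriptions = sum(totals.values())


def percent(*labels: str) -> float:
    """Share of all descriptions carrying any of the given labels."""
    return 100.0 * sum(totals[label] for label in labels) / n_descriptions


pct_very_good = percent("very good")
pct_satisfactory = percent(
    "very good", "generally accurate", "very good+generally accurate"
)
pct_poor = percent("poor")

print(f"{n_descriptions} descriptions in all")
print(f"very good: {pct_very_good:.1f} per cent")
print(f"satisfactory: {pct_satisfactory:.1f} per cent")
print(f"poor: {pct_poor:.1f} per cent")
```

On these counts the shares rated “very good,” satisfactory, and “poor” round to the 60, 81, and 2 per cent given in the text.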


[23] This study was conducted at Cranbrook Academy of Art, Bloomfield
Hills, Michigan.
[24] This study was conducted at State Teachers College, Milwaukee,
Wisconsin.


FUTURE USE OF THE TEST


The study of this test has not matured to a point where
it is possible to present scientifically dependable conclusions
about how such an instrument can be used most efficiently.
However, it does seem that the instrument may be used for
the purpose of counseling in so far as it may be possible to
decide where a particular student stands when compared
with his peers as far as his response to paintings is
concerned. Moreover, it may be possible to use this instrument
to ascertain changes in the performances of students, or 
groups of students, after they have taken art courses. For 
these two purposes the scores seem to furnish valuable 
evidence. 

This instrument can be used much more efficiently if in- 
dividual interpretations are made. When this is done, it 
seems possible, by means of the instrument, to get evidence 
about the specific art abilities of a student as well as about 
some of the features of his personality. It will be possible to 
discover some of the areas where he needs special help. By
repetition of the test, it will be possible to discover the areas 
in which he has changed and those in which he remained 
on the same level as before. 

Finally, even at the present stage of development, the test 
furnishes some insights regarding the way in which art ex- 
perience is tied up with personality structure. More extended 
studies will enlarge our understanding of important art- 
psychological questions, such as the ways in which art ex- 
perience varies with different age and sex groups, different 
cultural groups, and groups from different socio-economic 
levels. Information as to the particular way in which the
individual experiences art, combined with information about
the differences in the reactions of different groups, should
have implications for the teaching of art.


OTHER INSTRUMENTS 


Several other instruments to reveal the ways in which 
students respond to art experiences were developed experi- 
mentally but were not studied as carefully as the one just 
described. One of these was called Seven Modern Paintings 
(Form 3.9). A committee of art teachers selected seven
excellent large framed reproductions in color of modern
paintings, not too well known to students (a Cezanne, a Van
Gogh, a Picasso, a George Grosz, a Eugene Speicher, a 
Maurice Sterne, and an Alexander Brook). These were hung 
for at least a week at a time in six schools, without com- 
ment by teachers, allowing time for all interested students 
to become thoroughly familiar with the paintings. Then the 
art teachers in these schools asked all students, or a repre- 
sentative cross-section of students, to write any comments 
they cared to make about any or all of the paintings. The 
students were asked not to sign their names, but only to 
indicate their sex and grade in school, with the understand- 
ing that no attempt would be made to identify any comment. 
No directions were given except that they were not expected 
to write anything very profound or very clever, but to tell 
simply and honestly what they thought and felt about the 
paintings. In a few classes some of the more provocative 
comments were later read aloud, and more comments were 
collected during the ensuing discussion. About 12,000
comments were collected from about 1,000 students in grades
five through twelve.

These comments were sorted until the following widely 
prevalent modes of response were discovered:


1. Liking or disliking the paintings

2. Liking or disliking the subject of the paintings

3. Demands for photographic realism

4. Far-fetched interpretations of what the subject represented
or was doing: e.g., “The artist is trying to show
how the wilderness is creeping in on the little house.”

5. Fixed, dogmatic rules applied uncritically: e.g., “A portrait
should always have a dull, neutral background.”

6. Interpretations of the mood of the paintings: e.g., “The
position of the body and the drab colors suggest sorrow
and resignation.”

7. A feeling of understanding, or not understanding, the
artist's intention: e.g., “I don't see what he was driving
at.”

8. Comments indicating special sensitivity or insensitivity to
color

9. Comments indicating special sensitivity or insensitivity to
design qualities other than color


A few comments on each type about each painting were 
selected and mimeographed. Thereafter students were asked 
to indicate, while looking at these same reproductions, 
whether they agreed, disagreed, or were “neutral” with re- 
spect to each comment. An answer sheet adapted for ma- 
chine scoring was used. The directions also indicated that 
if a comment were true, but stupid and irrelevant, one should 
mark it “disagree”; and if it were neither true nor false, or 
partly true and partly false, or meaningless, one should mark 
it “neutral.” The way in which the test was set up made pos- 
sible two more categories of responses which were helpful 
in interpreting other scores: 
10. Tendency to approve (to agree with favorable statements
and to disagree with unfavorable statements)
11. Tendency to be “neutral” (the percentage of all statements
marked “neutral”)
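
The two derived scores can be sketched as follows. This is only an illustration under stated assumptions: the text defines “tendency to be neutral” as a percentage of all statements, but does not say how “tendency to approve” was scaled, so it is taken here to be the percentage of favorable or unfavorable statements answered in the approving direction. The `Statement` class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Statement:
    category: str
    # True if the comment praises the painting, False if it criticizes it,
    # None if it is neither (assumed classification, for illustration).
    favorable: Optional[bool]


def tendency_to_be_neutral(responses: Sequence[str]) -> float:
    """Percentage of all statements marked 'neutral'."""
    return 100.0 * sum(r == "neutral" for r in responses) / len(responses)


def tendency_to_approve(statements: Sequence[Statement],
                        responses: Sequence[str]) -> float:
    """Percentage of favorable/unfavorable statements answered in the
    approving direction (agree with praise, disagree with criticism)."""
    rated = [(s, r) for s, r in zip(statements, responses)
             if s.favorable is not None]
    approving = sum(
        (s.favorable and r == "agree")
        or (not s.favorable and r == "disagree")
        for s, r in rated
    )
    return 100.0 * approving / len(rated)


# A made-up four-item answer sheet.
statements = [
    Statement("liking the paintings", True),
    Statement("liking the paintings", False),
    Statement("color", True),
    Statement("mood", None),
]
responses = ["agree", "disagree", "neutral", "neutral"]
print(tendency_to_be_neutral(responses))
print(tendency_to_approve(statements, responses))
```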


No judgments by a jury were thus far involved except in 
classifying the statements as truly representing one category 
or another. For example, the statement “I don't know whether
it is a successful portrait because I can't see enough of the
subject's face” was selected by the jury as representing a
demand for photographic realism. No judgment at this point
was involved as to whether the comment was good or bad:
only whether it was an authentic demand for photographic
realism. No comments were included on which 100 per cent
of the jury of artist-teachers could not agree. This was
possible because there were 12,000 comments to choose from
and only 105 comments were used in the test (15 about
each painting).

In selecting comments revealing special sensitivity or in- 
sensitivity to color and to design, it was necessary to decide 
which comments showed sensitivity and which did not. In 
the “interpretations of the mood of the painting,” also, it was 
necessary to select comments which were obviously within, 
or very far beyond, the range of commonly acceptable in- 
terpretations. These judgments, however, were relatively 
easy to make, and 100 per cent agreement was secured. 

Although the committee originally intended to get away 
from the criterion of agreement with an adult jury as much 
as possible, it came to feel that it would be interesting to 
have the jury mark the comments, and to see to what extent 
children of various ages approached the jury's judgment.
The jury was composed of practicing artists who were also 
teachers—people who were presumably sensitive to art 
qualities and getting a great deal of enjoyment and stimula- 
tion from good painting. It was felt that if children ap- 
proached the jury's way of thinking and feeling about these 
objects as they grew older, the chances were favorable that 
they were headed in the direction of greater “appreciation.” 
The committee had become diffident about using the term 
“appreciation,” however, so they did not apply it to the per- 
centage of agreement with the jury. They were not sure that 
the jury was “right,” but believed it was reasonably mature 
as to judgment. They therefore called this score “general 
maturity of response.” This score is not to be taken too seri- 
ously. For example, 100 per cent agreement with the jury 
would probably be undesirable, since it would eliminate that 
individual idiosyncrasy of judgment which seems to be char- 
acteristic of people who enjoy painting. It was felt, how- 
ever, that a gain from about 50 per cent agreement to 75 per 
cent agreement as the child grew older would probably be
desirable. Within these limits, therefore, another category of
responses was created: 


12. General maturity of responses (agreement with the jury) 


The jury agreed almost unanimously in marking all of the 
statements except in the categories of “liking the paintings” 
and “liking the subjects of the paintings,” so these categories 
were eliminated from consideration in arriving at the “gen- 
eral maturity” score. It was apparent that two equally sensi- 
tive people could look at the same painting, and both appre- 
ciate it deeply, while one liked it and the other did not. 
Liking paintings was essential to appreciation, but liking any 
given painting was not. The same reasoning would hold with 
even greater force with respect to the subjects of the paint- 
ings. These two categories were included chiefly to discover 
how they would affect other scores. 
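
The “general maturity” score described above amounts to a percentage of agreement with the jury, computed over every category except the two on liking. A minimal sketch, assuming each statement carries a category label and a single jury answer; the names `EXCLUDED` and `general_maturity` and the sample data are hypothetical:

```python
from typing import Sequence

# Categories on which the jury did not agree, left out of the score.
EXCLUDED = {"liking the paintings", "liking the subjects of the paintings"}


def general_maturity(categories: Sequence[str],
                     jury_key: Sequence[str],
                     responses: Sequence[str]) -> float:
    """Percentage of scored statements on which the student's mark
    (agree / disagree / neutral) matches the jury's mark."""
    scored = [(key, resp)
              for cat, key, resp in zip(categories, jury_key, responses)
              if cat not in EXCLUDED]
    return 100.0 * sum(key == resp for key, resp in scored) / len(scored)


# A made-up five-item answer sheet; the first item is excluded.
categories = ["liking the paintings", "color", "design", "mood",
              "photographic realism"]
jury_key = ["agree", "agree", "disagree", "agree", "disagree"]
student = ["disagree", "agree", "disagree", "neutral", "disagree"]
score = general_maturity(categories, jury_key, student)
print(f"general maturity of response: {score:.0f} per cent")
```

As the text cautions, a score near 100 per cent is not the aim; the committee looked for growth within roughly the 50 to 75 per cent range.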

Many of these categories of responses are desirable in one 
period of artistic development and undesirable in another. 
“Demands for photographic realism,” for example, would 
have been accepted as desirable—as making for artistic 
progress—in the early Renaissance, and perhaps they may 
still be considered desirable at certain stages of adolescence. 
To make scores easier to interpret, however, it was conceded 
that art teachers of this generation generally regard demands 
for photographic realism as undesirable, so this category was 
stated negatively in the summary sheets as “Avoids evaluating
in terms of photographic realism.” Thus a high score
always calls attention to what most art teachers would re- 
gard as strength, and a low score to a weakness. 

This test has not yet been scientifically validated, since it 
was developed only recently and has not yet been given to 
enough students to justify a statistical report on validity and 
reliability. Early returns, however, are very promising; at 
least promising enough to justify further research along 
these lines. The test requires some sensitivity to the meaning
of words, but verbal difficulties are minimized in two
ways. First, students do not have to verbalize their responses 
for themselves, but only to indicate whether they agree or 
disagree with a comment which has already been phrased 
for them. Second, the comments are in the language of other 
students who have been able to put their thoughts and feel- 
ings into words, so the student is not confronted with adult 
concepts in adult terminology. Comments were edited only 
enough to remove ambiguities. Nevertheless, low scores 
made by students who are known to be nonverbal should 
be taken with a grain of salt. 


Chapter V 
EVALUATION OF INTERESTS 


INTRODUCTION 


The introduction to Chapter IV mentioned the close con- 
nection of interests with appreciations and the difficulty of 
distinguishing them in specific instances. Work in both areas 
was initiated by a Committee on Interests and Appreciations,
which was later divided into sub-groups when it became
apparent that techniques for evaluating interests and
appreciations would be sufficiently different to justify a division
of labor. The sub-committees on appreciations developed in- 
struments, which were described in Chapter IV, to discover 
the ways in which students responded to literature and the 
arts. The sub-committees on interests developed instru- 
ments to discover and appraise interests revealed by choices 
of books, magazines, newspapers, radio, and motion pic- 
tures, and interests fostered by the various fields of study 
in school.


ANALYSIS OF THE OBJECTIVE 


One of the first conclusions of the Committee on the Eval- 
uation of Interests was that interests may be regarded both 
as means and as ends. When they are regarded as means,
teachers try to discover activities in which pupils are already
interested, and to utilize such activities in teaching pupils
whatever they have to learn. They justify certain activities in
the school program on the ground that they are similar or
related to activities in which pupils have expressed an
interest. They guide pupils who have such interests into these
activities, and direct other pupils elsewhere. They try to
include in the program more activities in which pupils have
manifested a lively interest. If little or no interest is ex- 
pressed in a given activity, it is regarded as not likely to 
promote learning. 

When interests are regarded as ends or objectives, how- 
ever, a different approach is indicated. Teachers have to de- 
cide in what areas of activity pupils need to develop inter- 
ests, and the character and direction of interests in these 
areas which promise most for individual happiness and the 
common welfare. They must then examine the evidence of 
interests already developed as critically as test scores in other 
areas of objectives, noting strengths and weaknesses, and 
changing the school program to build upon the strengths and 
remedy the weaknesses. For example, it is generally assumed 
that pupils should develop interests in one or more wisely 
selected fields of service to society, since a man who is in- 
terested in his work is usually a happier and better citizen 
than one who is not. If pupils, then, shortly before gradua- 
tion from high school, have not developed such interests, or 
if their interests lie in a few fields which are inappropriate 
to their talents and opportunities, the school has failed in 
one of its obligations toward them. The character and direc- 
tion of these vocational interests may also be examined. 
Pupils may be interested in a career primarily as an oppor- 
tunity to get rich at the expense of other people, to "get to 
the top" against ruthless competition, and to enjoy a Holly- 
wood conception of "success." Or they may be interested in 
a career primarily as a job that needs doing, as a part of a
great cooperative endeavor to provide adequately for our 
common needs. The latter promises so much more for indi- 
vidual happiness and the common welfare than the former 
that it may be regarded as one of many criteria for judging 
vocational interests. In this same fashion all other areas of
desirable interests may be examined for evidence of growth
in the kinds of interests which the school is trying to foster. 

The Committee on the Evaluation of Interests accepted 
both ways of regarding interests as legitimate and necessary, 
but conceived its own primary function to be that of helping 
to evaluate interests as objectives—as outcomes rather than 
as starting-points of the educative process. One reason for 
this decision was that in the agreement with the colleges
cooperating in this Study, the schools promised to provide only
three types of evidence as a basis for admission to college, 
and one of them was “evidence of well-defined, serious in- 
terests and purposes.” Another reason was that relatively 
little work had been done in evaluating interests as objec- 
tives. Most of the standardized techniques as well as in- 
formal school practices attempted to discover interests as 
starting-points or clues in attaining other objectives; they did
not evaluate the effectiveness of a school program in
developing interests which were important for adolescent
development and social progress.

In the course of its work, the committee had to discover 
and overcome three difficulties which commonly deter the 
evaluation of interests as objectives, and which may
hamper the work of similar committees in the future. One was
the unconscious assumption that little can be done about
interests, that any interest is as good as any other if it is not
obviously criminal, and that having no interests in important
areas of activity is at most a misfortune, not a serious
handicap which should be remedied by the school. The
committee came to regard these assumptions as completely false.
No one ever had an interest which was not learned, or picked
up in one way or another from the environment. Even if
something in the organism generates the interest, such as an
interest in food, the character and direction of the interest
are obviously a product of the environment. The Eskimo is
said to enjoy seal blubber and tallow candles, while we
prefer beefsteak and potatoes. If all our present interests were
acquired and are continually changing, new interests can 
also be acquired, and less promising interests can be changed 
for the better. A school program may be judged in part by 
the character, direction, and importance of the interests 
which it generates. 

A second factor deterring the evaluation of interests as 
objectives among progressive teachers was the common as- 
sociation of evaluation with penalties and failure. It is espe- 
cially obvious in this area that if pupils are given low marks 
or are penalized in any other way for not having interests 
which they ought to have, they will subsequently “fake” an 
interest in these areas, thus invalidating the tests without 
affecting their real interests. This consideration only points 
to the way in which almost all evaluation data should be 
used, but especially the data on interests. If serious defi- 
ciencies are revealed, the program should be changed to 
remedy them. It will do no good whatever to flunk the pupils 
who are deficient in these respects, nor even to criticize them. 
They need not even be told the judgment of the school in 
regard to their interests. That is primarily a matter to be dis- 
cussed in faculty meetings devoted to curriculum revision, 
and in case conferences devoted to planning the program
of individual students.

A third factor deterring the evaluation of interests as
objectives was the suspicion that people who set out to implant
interests in the young have in mind only adult interests. This
danger was recognized and guarded against in devising
instruments to discover interests which are desirable at the
adolescent level. These may include some interests which
would be inappropriate for adults; they may not include
some interests which are indispensable for adults; and they
may translate other adult interests into adolescent terms,
just as little children transform the adult interest in children
into an interest in dolls. None of these considerations denies 
that there are areas and directions in which adolescents 
should develop interests. If they do not, then the school 
should do something to help them. 

This view of interests as school objectives, which gradu- 
ally evolved during the Eight-Year Study, rests upon three 
basic assumptions, all matters of common observation. The 
first is that people who have desirable interests in the major 
areas of life activities are obviously happier and better off 
than those who do not. If a man is not interested in his 
work, or if he is little interested in his home and family, he 
is so plainly miserable that the matter does not admit any 
philosophic uncertainty. Second, interests are the mainspring
of the educational process. They practically determine what
can be effectively learned. If schools, therefore, wish to
develop competence in the major areas of living, they must first
develop interests in those areas. Third, the common welfare
depends upon the character and direction of the interests of
all citizens. If these are narrow and selfish, or morbid and
cruel, as in the later days of the Roman Empire, the quality
of the civilization obviously declines. These three
assumptions leave no choice but to find out what interests are
desirable, to foster them by every means consistent with our
democratic traditions, and to ascertain at regular intervals
which of them are developing satisfactorily, and which of
them need renewed attention.

The first principle which the committee followed in
locating desirable interests was that some interests should be
developed in each major area of living. These may be classified
broadly as economic interests, civic interests, interests
centering in the home, and recreational interests. The first three
areas were sampled chiefly in the Interest Index which is
described on pages 338-348, although many inferences as to
interests in these areas can be drawn from other instruments
described in this report. Lack of civic interests, for example, 
was found to be reflected frequently in high scores on un- 
certainty and inconsistency on the Scale of Beliefs, and in 
great confusion of implications and values on the Social 
Problems test. These areas were also studied through many 
informal instruments devised for particular courses or situa- 
tions and not reported here, and through standardized tests 
available from other sources. Vocational interests, for exam- 
ple, were frequently sampled by the Strong Vocational In- 
terest Blank, by papers written in various courses, and by 
counseling conferences. Interests in these areas were also re- 
vealed by the instruments developed in the area broadly clas- 
sified as recreational interests. An interest in books, for ex- 
ample, would be classified as a recreational interest, but if 
a student read an unusual number of rather technical books 
about architecture, and if in the arts and crafts (also clas- 
sified as recreational) he devoted himself to drafting, to in- 
terior decoration, and to making models of houses, buildings, 
and communities, one might safely infer a vocational interest 
in architecture. Thus, all of the instruments on interests cut 
across the areas of activity in terms of which they are first 
classified. 

In the area broadly classified as recreational interests, the 
committee distinguished five sub-areas in which interests 
should be developed: interests in people, in sports and games,
in the arts and crafts (including fine and industrial arts,
music, dancing, drama, movies, and radio programs), in
reading, and in science or scholarship (at this level, interests in
the various school subjects). Interests in people were such an
important element in personal and social adjustment that an
instrument revealing these interests among others will be
described in Chapter VI. The other “recreational” interests
were sampled by the instruments now to be described.




The Reading Record 


The character and direction of interests in reading which 
the committee regarded as most promising were the follow- 
ing: 

1. The reading should be abundant. 

2. The reading should be varied as to type and content. It 
should include, for example, both fiction and non-fiction; 
it should reflect a wide range of human experience, and 
deal with many subjects. 

3. The reading should be selective, showing some concentration
of interest upon subjects or types of reading suited
to the reader.

4. The reading should be increasingly mature, gradually in- 
creasing in difficulty, complexity, and depth of insight. 


It was agreed that evidence of progress in these directions 
could be secured through a record of reading kept by stu- 
dents and summarized periodically in these terms. The com- 
mittee first tried out a very long and elaborate record of all 
reading done over a period of two weeks. This included assigned
and unassigned reading in books, pamphlets, magazines,
and newspapers, and asked all questions about it
which any member of the committee thought would be 
helpful. Over 1,000 students entered their reading on this 
record every morning for two weeks. When the results were 
analyzed, it was agreed that in the future: 

1. The record should involve an irreducible minimum of time
and effort lest distaste for reading should be engendered.

2. The record should be filled out at stated intervals, usually
once a week, in English classes. Leaving it to pupils to fill
out at their convenience usually resulted in incomplete
records.

3. Only voluntary reading should be recorded. Students occasionally
had difficulty in distinguishing voluntary from
required reading, especially when books were strongly
suggested by teachers, or when supplementary reading
was required but not in specified books or amounts.
Teachers were to decide what reading might be regarded
as voluntary, or as indicating individual preferences.

4. The record of books read voluntarily should be kept 
throughout the academic year in order to get a large 
enough sample to provide safe inferences as to the direc- 
tion of reading habits and tastes. A reliable sample of 
magazine and newspaper reading, however, could be ob- 
tained through a check list or questionnaire, administered 
annually or semi-annually. 


The minimum record of books read voluntarily consisted,
in most of the Thirty Schools, of notebook pages with spaces
to record the author and title of each book, the date on
which it was finished, and a few comments. Some teachers
asked also for the number of pages in order to secure a more
precise measure of "abundant" reading than the number of
books. A few teachers provided a list of types of books,
breaking up "fiction" (which constituted about 90 per cent
of all voluntary reading) into a number of smaller categories
such as school stories, adventure, mystery, love and romance,
etc., and asked pupils to classify each book in terms of this
list. Other teachers, who were especially interested in widening
horizons through reading, asked pupils to classify each
book by the nationality and period of the author, and by the
period and country with which the book dealt. This was
done in very broad categories. Since most of the authors
read were American or English, and most of the books had an
American or English setting, both authors and settings were
classified by place in a few broad categories, among them
"American" and "English." The periods of both authors and
settings were classified as B.C., A.D.-1500, 1500-1800, 1800
to the present, and "Other" (when the period dealt with
was not specified, or in the future). Most of the tallies
accumulated in the spaces marked "American" and "English"
from 1800 to the present, and served to remind pupils of the
vast expanse of space and time which they had not yet explored
in their reading. Finally, some teachers asked pupils
to indicate how well they liked each book on a rough scale
from 0, not at all, to 4, signifying boundless approbation.

Since most of these items could be recorded by number, 
referring to an item in the summary sheet, some teachers used 
the following form, mimeographed on notebook pages or on 
index cards: 


Date ________  Author ______________________
Title ______________________________________
AUTHOR:  Place _____  Time _____  Type of book _____
SETTING: Place _____  Time _____  Enjoyment _____
Comments:


These teachers asked pupils to keep their own summary 
sheet up to date as they read. This was often set up in some- 
what the fashion shown on page 322. 

TYPES                                   ENJOYMENT

Fiction
 1. Children's stories
 2. Animal stories
 3. School stories
 4. Sports stories
 5. Adventure-Western
 6. Sea ...
 7. Success stories
 8. Humorous stories
 9. Detective-mystery-horror
10. Love and romance
11. Historical novels
12. Novels on social problems
13. Tragic novels
...
16. Biography, autobiography
17. Books of plays
18. Books of poems
19. Books of essays
...
30. All other non-fiction
Non-Fiction Totals

GRAND TOTALS


When this sort of summary was kept by pupils, as soon
as they entered a book in their reading record, they put a
tally on the summary sheet opposite the type of book and
under the degree of their enjoyment. As these tallies accumulated,
they presented a graphic summary of the pupil's
reading development in at least three of the four directions
which the committee regarded as important. The total number
of tallies indicated abundance of reading; their dispersal
represented variety by types, periods, and places; and
concentration at particular points on the first gridiron,
accompanied by high ratings on "enjoyment," represented
selectivity, which then had to be considered in terms of its
appropriateness to the reader. The first gridiron also gave a
rough indication of increasing maturity of reading, for the types
of fiction listed there ranged from juvenile to adult, and the
amount of non-fiction read proved also to be a crude measure
of maturity, since so little of it was read by the younger
pupils. In the second gridiron, almost any tallies outside the
spaces reserved for American and English authors from 1800
to the present represented a gain in maturity. These measures
of maturity, however, were too crude for the purposes
of English teachers who wished to measure the effects of
various experimental programs, so a more refined measure
was developed. This measure takes a good deal of time and
some practice to use, so that it will probably be used chiefly
in connection with experimental programs.
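The tallying of the summary sheet can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name summarize_reading, the shape of the records, and the sample entries are all hypothetical and are not part of the instruments used in the Thirty Schools. It shows how abundance, variety, and concentration might be read off the tallies.

```python
from collections import Counter

# Hypothetical reading-record entries: (type of book, enjoyment 0-4).
# The type names follow the summary sheet; the entries are invented.
records = [
    ("Adventure", 3),
    ("Detective-mystery-horror", 4),
    ("Detective-mystery-horror", 4),
    ("Biography, autobiography", 2),
    ("Historical novels", 3),
]

def summarize_reading(records):
    """Tally a reading record by type of book and by enjoyment rating."""
    by_type = Counter(book_type for book_type, _ in records)
    by_cell = Counter(records)  # one tally per (type, enjoyment) cell
    top_type, top_count = by_type.most_common(1)[0]
    return {
        "abundance": len(records),               # total number of tallies
        "variety": len(by_type),                 # distinct types read
        "concentration": (top_type, top_count),  # most-read type
        "grid": by_cell,
    }

summary = summarize_reading(records)
```

Whether a concentration of tallies is appropriate to the reader remains a matter of judgment, as the text notes; the sketch only counts.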

This measure of maturity was based upon a study by Jeanette
H. Foster of the reading of 15,000 adults.¹ Her analysis
showed that the 250 authors of fiction most frequently read
could be objectively classified in six different levels of maturity
in terms of the average age, education, occupational level,
and general reading habits of their readers. Her placement
of these authors on the various levels of maturity coincided
with the judgment of the committee, looking at the list from
the standpoint of the sort of maturity in reading which they
wanted to develop. They therefore extended her list to include
approximately 1,000 authors of fiction most frequently
read by their pupils, matching each author with the authors
whose maturity level had been determined objectively.²

At the same time they made a detailed classification of 
types of fiction and classified the works of each author in 
terms of this list. Authors typical of each of the six levels of 
maturity, from 1 (very easy reading) to 6 (very difficult
reading), and of various types of fiction may be found in 
the following sample: 

Author                  Type                        Maturity Level
Altsheler, Joseph A.    Setting                     [illegible]
Austen, Jane            Character                   6
Bacheller, Irving       Historical                  [illegible]
Barrie, James           Character, Romance          4
Bennett, Arnold         Character                   5
Boyd, James             Historical                  4
Brush, Katharine        Character                   3
Connolly, J. B.         Adventure                   3
Conrad, Joseph          Adventure, Psychological    6
Curwood, James O.       Adventure                   [illegible]
Dell, Ethel M.          Romance                     1
Douglas, Lloyd          Philosophical               2

¹ Jeanette H. Foster, "An Approach to Fiction through the Characteristics
of Its Readers," Library Quarterly (April, 1936), pp. 124-174.

² The committee responsible for the extension was composed of Harold
Anderson, University of Chicago High School; Irvin C. Poley, Germantown
Friends School; B. J. R. Stolper, Lincoln School; Ruth M. Ersted, Supervisor
of School Libraries in Minnesota; Jennie Flexner, New York City
Public Library; Jeanette Foster, Hollins College, Hollins, Virginia; and
Douglas Waples, Graduate Library School, University of Chicago. Douglas
Waples served as a consultant on research in reading to the Committee on
the Evaluation of Reading Interests throughout its work and took major
responsibility for the development of the maturity scale.


This list provided at least a standard, uniform, agreed-upon
classification of fiction by type and maturity so that
teachers in different schools could compare the results of
their reading programs. These were summarized by teachers
in a new gridiron, with types of fiction at the left and columns
for the six maturity levels, unclassified, and totals for
each type. Until the list became familiar, each book recorded
by a pupil had to be found in the list and tallied in accordance
with the type and maturity level there assigned to it.
Some teachers avoided this labor by securing enough copies
of the list of authors to enable each pupil to tally his own
books on his summary sheet. It was feared that this expedient
might lead pupils to attach undue importance to reading
books at the higher levels of maturity, but when it was
clearly understood that the maturity figure was largely an
index of difficulty, and that there was no virtue in reading
books that one could not understand, this fear proved to be
unfounded.

The list enabled teachers to classify about 75 per cent of
the fiction read by senior high school pupils. Other authors
were classified by matching them with classified authors, or
were tallied as "unclassified." If even 75 per cent of the fiction
read by a pupil were classified, this was sufficient in
most cases for an individual diagnosis of the direction of
reading habits and tastes in fiction.³ The list does not include
enough authors commonly read by pupils in grades below the
ninth to be discriminating beyond this point.
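As a rough illustration of how such a gridiron might be tallied, the following Python sketch looks up each author in a small table of maturity levels, counts the tallies at each level, and reports the share of the reading that could be classified. The table holds only the legible rows of the sample list given earlier; the function name maturity_summary, the pupil's record, and the author "Unknown, A." are hypothetical. The median level is used here because the text later speaks of a median maturity level.

```python
from statistics import median

# Maturity levels for a few authors, taken from the legible rows of the
# sample list. The committee's full list held about 1,000 authors.
AUTHOR_LEVELS = {
    "Austen, Jane": 6,
    "Barrie, James": 4,
    "Bennett, Arnold": 5,
    "Boyd, James": 4,
    "Connolly, J. B.": 3,
    "Conrad, Joseph": 6,
    "Dell, Ethel M.": 1,
    "Douglas, Lloyd": 2,
}

def maturity_summary(authors_read):
    """Return tallies per level, the share classified, and the median level."""
    levels = [AUTHOR_LEVELS[a] for a in authors_read if a in AUTHOR_LEVELS]
    tallies = {level: levels.count(level) for level in range(1, 7)}
    unclassified = len(authors_read) - len(levels)
    share = len(levels) / len(authors_read) if authors_read else 0.0
    return {
        "tallies": tallies,
        "unclassified": unclassified,
        "share_classified": share,
        "median_level": median(levels) if levels else None,
    }

# Hypothetical record: one author per book read by a pupil.
pupil = ["Dell, Ethel M.", "Douglas, Lloyd", "Boyd, James", "Unknown, A."]
result = maturity_summary(pupil)
```

In this invented record three of four books are classified, which matches the 75 per cent the text treats as sufficient for an individual diagnosis.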

No classification of non-fiction by maturity was attempted 
for several reasons. It comprised only about 10 per cent of 
pupils’ voluntary reading in most schools. It was too scat- 
tered to be easily classified. Thousands of different authors 
were read, but only a few by more than a handful of pupils. 
Frequently only parts of books were read, such as single 
poems, plays, essays, or chapters about a particular subject. 
Since so little non-fiction was read by the younger pupils, 
the mere number of books or of pages of non-fiction read 
proved to be a sufficient index of maturity for the purposes 
of the teachers involved. Any refinement of this simple 
measure would have cost more in time and effort than it
was worth.


The Magazine Checklist 

The record of two weeks' reading, referred to above,
proved that a continuous record of magazine reading would
be more burdensome than the result would justify. It also
seemed to indicate that the titles of magazines read would
be sufficient for purposes of evaluation, without a list of the
authors and titles of stories and articles in them. While some
magazines included a wide range of types of material and
maturity levels, most magazines were fairly homogeneous in
both respects. Furthermore, pupils read magazines rather
indiscriminately, so that no safe inferences could be drawn
from their choices of particular authors.

³ For a detailed presentation of the reading summary for one student,
see Wilfred Eberhart, "Evaluating the Leisure Reading of High-School
Pupils," The School Review, XLVII (April, 1939), pp. 257-69.

When it was decided to sample magazine reading only
once or twice a year, it was found that pupils tended to forget
many of the magazines which they were known to have
read during that period unless they were reminded by a
checklist. In his Cooperative Study of Secondary School
Standards, Eells had found that 108 magazines accounted
for about 94 per cent of all the magazine reading done by
17,338 representative high school pupils.⁴ These magazines
were listed under the following headings:


 1. Popular weeklies
 2. Popular monthlies
 3. Picture magazines
 4. "Elite" magazines
 5. Non-fiction weeklies
 6. Monthly reviews
 7. Classroom magazines
 8. Popular science
 9. Sports
10. Special interests
11. Youth magazines
12. Detective, adventure, and true-story magazines
13. Motion picture and radio magazines
14. Farm magazines


Students were asked to check each magazine they had
read in three columns: one indicating whether they read it
seldom, occasionally, or regularly; another indicating whether
they usually skimmed it, read parts of it, or read it in full;
and a third indicating whether they obtained the magazine
in school, at home, from a friend, a public library, a newsstand,
or elsewhere. The last check had little significance for
evaluation, but interested some teachers for other reasons
and took almost no additional time, so that it was included
for their sake. At the end of the checklist pupils were asked
to state what magazine they liked best, what magazines were
received regularly at home, what magazines they had begun
to read as a result of consideration given them in school,
and what magazines they would like to have added to the
school library.

⁴ Walter Crosby Eells, "What Periodicals Do School Pupils Prefer?" Wilson
Bulletin for Librarians (December, 1937). Reprinted in Evaluation of
Secondary Schools: Supplementary Reprints. Cooperative Study of Secondary
School Standards, 744 Jackson Place, Washington, D. C.

The maturity level of 29 of the magazines in this checklist
was determined objectively by Wert by finding the average
intelligence percentile, English placement score, and score
on the Cooperative Contemporary Affairs Test of readers of
each magazine among 4,763 students at Ohio State University,
the University of Minnesota, and five smaller colleges
in the Midwest.⁵ He converted these data into an index
figure for each magazine by dividing the average score of
its readers on each test by the average score of readers of
the Saturday Evening Post. The unweighted average of the
three quotients thus obtained yielded an index of maturity
or "quality" for each magazine, ranging from about 40 for
most of the "pulp" magazines to about 200 for The Nation
and The New Republic. Abundance, variety, and concentration
of magazine reading were studied as in the case of
books. Although it was feared in the beginning that magazine
reading would not be a significant index of reading interests,
since pupils would tend to read whatever magazines
were received at home or in school, the variety of magazines
read and its coincidence with other measures of reading
development soon dispelled this fear.
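The arithmetic of the index described above can be sketched as follows. The text states only that each of the three average scores was divided by the corresponding average for readers of the Saturday Evening Post and that the three quotients were averaged without weights; the multiplication by 100 is an assumption made here so that the Post itself comes out at 100, which is consistent with the reported range of about 40 to 200. The function name maturity_index and all scores are invented for illustration.

```python
def maturity_index(reader_means, post_means):
    """Unweighted average of the three quotients, scaled so the Post = 100.

    The scaling by 100 is an assumption; the source gives only the range
    of the resulting index (about 40 to about 200).
    """
    quotients = [reader_means[test] / post_means[test] for test in post_means]
    return 100 * sum(quotients) / len(quotients)

# Invented average scores on the three tests, for illustration only.
post = {"intelligence": 50.0, "english": 60.0, "affairs": 40.0}
magazine = {"intelligence": 75.0, "english": 90.0, "affairs": 60.0}

index = maturity_index(magazine, post)
```

Dividing by the Post's averages puts tests with different scales on a common footing before they are averaged, which is presumably why a widely read magazine was chosen as the base.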


⁵ James E. Wert, "A Technique for Determining Levels of Group Reading,"
Educational Research Bulletin, XVI, 4 (May 19, 1937), pp. 113-121.


Newspaper Questionnaire

In appraising students' reading of newspapers it seemed
important to determine (1) what papers they read regularly
or occasionally, (2) the amount of time devoted to
newspaper reading, and (3) the sections of the paper which
they read regularly. Since the newspapers read by students
were those published in their communities, no attempt was
made to prepare a checklist which sampled the titles of
newspapers. Instead, a newspaper questionnaire was developed
which provided spaces for the student to enter the
names of the newspapers which he read and asked him to
check the sections which he read regularly. Headings such
as editorial, financial news, comics, book reviews, etc., were
listed for him to check. The student was also asked to estimate
the amount of time he spent each week in reading
newspapers, and to indicate the editorial policy of each paper
as "liberal," "conservative," "Republican," or "Democratic."
Few students were able to do the latter accurately.


Radio and Motion Picture Checklists 


The experience of the Thirty Schools indicates that a
checklist is a feasible device for gathering evidence of interests
revealed by choices of radio programs and motion pictures.
A list of the two or three hundred motion pictures
which have appeared during a three-month period may be
given to students with the request that they check each picture
which they have seen and indicate their degree of liking
for it. In one such checklist used in the Eight-Year
Study,⁶ recent motion pictures were listed alphabetically
under the following headings: comedy, romance, historical,
musical, sports, documentary, Western, adventure, and mystery.
Including the names of the principal actors in each picture
proved to be helpful in refreshing the student's memory,
since titles often had little relation to the film. Students were
asked to check each film which they had seen and to judge
its quality. Through the use of such a checklist, data can be
secured concerning (1) the number of films seen, (2) the
types of films seen, and (3) the opinions of students concerning
the quality of the films. In addition, the level of
quality, as judged by critics writing motion picture reviews
in selected periodicals, can be determined for each film seen
and a median quality level computed for the films seen by
a student.

⁶ The motion picture checklists used in the Eight-Year Study were
prepared with the assistance of [name illegible in the source].
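The three kinds of data named above, together with the median quality level, can be illustrated with a short Python sketch. Everything specific in it is hypothetical: the film titles, the numeric scale of the critics' ratings, and the function name film_summary are assumptions, since the source does not say how the critics' judgments were scored.

```python
from statistics import median

# Hypothetical critics' quality ratings (higher is better) per film title.
CRITIC_QUALITY = {"Film A": 4, "Film B": 2, "Film C": 5, "Film D": 3}

def film_summary(checked):
    """Summarize a checklist; checked maps title -> (type, student's rating)."""
    by_type = {}
    for film_type, _ in checked.values():
        by_type[film_type] = by_type.get(film_type, 0) + 1
    rated = [CRITIC_QUALITY[t] for t in checked if t in CRITIC_QUALITY]
    return {
        "films_seen": len(checked),                           # item (1)
        "by_type": by_type,                                   # item (2)
        "student_ratings": [r for _, r in checked.values()],  # item (3)
        "median_quality": median(rated) if rated else None,
    }

# Invented checklist responses for one student.
student = {
    "Film A": ("comedy", 3),
    "Film B": ("Western", 4),
    "Film C": ("comedy", 2),
}
report = film_summary(student)
```

A median is less affected than a mean by one unusually good or poor film, which may be why a median level was computed.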

Similar checklists are useful as a measure of the extent and
character of the radio listening in which students engage.
One checklist used in the present study⁷ lists the popular
programs heard over national networks between four and
ten p.m., and all day Saturday and Sunday, under such headings
as variety shows, comedians, serials, religious programs,
classical music, dance music, news commentators, sports
broadcasts, and discussion programs. It requests the pupil to
check each program which he has heard in columns indicating
whether he likes it very much and listens to it whenever
he can, likes it fairly well but does not go out of his way to
listen to it, or dislikes and avoids it. As with the movie
checklist, a tabulation of responses reveals the programs of
various types listened to frequently and enjoyed most. Since
both motion picture and radio checklists go out of date
quickly, their usefulness depends upon their continuous
revision.

The radio checklist is obviously more than a measure of
interest in radio programs. For the first time in history some
of the world's best music and a great deal of the world's
worst music are equally available to everyone, with a perfectly
free choice between them. The level of musical taste
revealed by choices of radio programs is based upon a very
extensive sample of voluntary behavior in a natural situation.
Studies in this field indicate that high school students
are at least within earshot of a radio for an average of two
hours daily. They listen to the radio far more than they read.
Hence, radio preferences are one of the most valid, reliable,
and sensitive indices now available of interests not only in
music but in drama, current affairs, social problems, and the
like. The radio is also unique among the instruments commonly
used by schools to discover interests in that it so
readily brings to light undesirable interests, or interests that
are at least unpromising and a waste of time. The possibilities
of this medium of evaluation have only begun to be
explored.⁸

⁷ The radio checklists used in the Eight-Year Study were prepared with
the assistance of I. Keith Tyler, Director, [remainder of note missing in
the source].


Validity and Reliability


The problem of determining the validity and reliability of
activity records differs from the case of paper-and-pencil
tests. A test score is regarded only as an indication of how
students would respond in an actual situation calling for the
ability measured by the test. It therefore has to be demonstrated
that the way in which students respond to the test
is the way in which they habitually respond to appropriate
life situations. The test maker ideally tries to get an accurate
record of how students respond to such situations and computes
the correlation of their test scores with these responses.
Often this is not possible, so some other indirect measure,
such as marks in courses, has to be used instead, but an activity
record is commonly accepted as the best criterion
against which to validate a test. If the activity recorded is
the objective, the only question of validity in the record of
that activity is whether it is accurate. The only question of
reliability is whether the record includes a large enough
sample of the behavior in question to make sure that it is
typical. If all the behavior relevant to a given objective were
recorded, then there would be no question of reliability at
all. Only when a small sample of behavior is taken do we
need assurance that it fairly represents the habitual behavior
of a given student.

In the case of interest in reading, the behavior which
teachers were trying to develop was voluntary reading in
books, magazines, and newspapers that was abundant, varied,
selective, and increasingly mature. A record of such activity
was secured. If the record was accurate and complete, it was
a valid measure of progress toward the objective, by the very
definition of validity. The behavior recorded was the objective
itself, not an associated behavior which might or might
not reflect the desired behavior accurately.

⁸ Many promising instruments have been developed by the Radio Division
of the Bureau of Educational Research, Ohio State University.

To find out whether the record was accurate and complete,
during 1940 a member of the Evaluation Staff interviewed
51 students in the tenth, eleventh, and twelfth grades
of a private, urban secondary school, who had been keeping
rather extensive activity records as a part of their school
program. These records included reading in books, magazines,
and newspapers, attendance at plays, operas, and concerts,
and choices of radio programs. The staff member explained
that his interest was only in finding the facts about
their records and that he had no academic connection with
their school or with any college. He then talked informally
with these students, asking them whether or not activities in
which they had not engaged ever were recorded, and
whether or not they recorded all the activities in which they
engaged.

All of the 51 students interviewed said that books which
they had not read were never entered in the record. In most
schools in the Eight-Year Study, this was no more than
prudent, for nothing was to be gained by padding the list,
as the books recorded as read were discussed in conferences.
Of the ten tenth-grade students interviewed, all said
that all the books which they read were consistently entered.
Of the 22 eleventh-grade students interviewed, ten said that
not all their reading was recorded. Of the 19 twelfth-grade
students interviewed, three said that not all their reading
was recorded. The students who said that not all their reading
was recorded explained that "trashy" books sometimes
were not entered. These "trashy" books, they said, were
chiefly mystery or detective stories. Also they explained that
parts of books, such as single plays, poems, essays, or stories
from a collection often were not entered.

When the students were asked about the recording of 
motion pictures, their responses indicated that for many of 
them the motion picture record was quite incomplete. A few 
students who seldom went to motion pictures said their rec- 
ord was complete. However, most of the students said that 
not all the motion pictures which they saw were recorded. 
Some students said they consistently omitted recording the
"poor" movies which they saw; some said they omitted recording
the second feature, that is, the one they did not go
to see, of a double feature program; some said that they 
often neglected to enter all the motion pictures which they 
saw, or forgot them and were unable to enter them. 

All 51 of these students said that their record of plays,
operas, concerts, etc., attended was complete and accurate.
Such activities as attending plays and concerts, they explained,
were important experiences and easily remembered;
consequently all these were consistently recorded.

These interviews led to the conclusion that for these
students the record of books read was accurate in what it
contained but that it was incomplete. This finding would demand
caution in interpreting the summaries of some students'
records of books read. The quantity of reading represented
in these summaries would have to be regarded as a
minimum; the median maturity level of the fiction read
would have to be considered in error, probably in that it
would be too high. A second conclusion was that these students'
difficulties in keeping a continuous record of motion
pictures attended were so great as to make the use of a
checklist technique a more desirable procedure. A third
conclusion was that for these students a record of plays,
operas, concerts, etc., attended could be kept easily and accurately
and represents a satisfactory method of securing
evidence of participation in such activities.

Three observations need to be made. One is that under 
certain conditions the technique of asking students to re- 
cord information about their participation in certain activi- 
ties can yield valid and reliable data for the appraisal of 
interests. The interviews cited above revealed that for most 
of the students it was reasonably certain that their record 
of books read was both accurate and complete. Second, it 
must be observed that the student’s attitude toward his rec- 
ord may be a crucial factor in determining the validity of 
the data. Recognizing this, the teacher should help students 
to understand and accept the purposes of this type of evaluation
and should remove as far as possible all academic or social
pressure which would tempt students to falsify their records. 
Third, it is important to remember that the interpretation of 
data derived in this fashion should attempt to take into account
the conditions under which they were gathered.

The validity of the evidence secured by means of checklists
is dependent upon many of the same factors as is the
validity of the evidence secured by means of continuous records.
A checklist requires that a student recognize, rather
than recall, those activities in which he has participated;
thus it demands a less difficult task of the student. A checklist,
however, often must present only a sample of the many
possible activities or materials and thus is dependent upon
the adequacy of the sampling. The Checklist of One Hundred
Magazines, for example, presents to the student only a
fraction of the total number of magazines which are published.
There is evidence, however, that this sample is adequate
for determining the magazine reading interests of
secondary school students. Students, of course, may be dishonest
in responding to a checklist. Again it must be pointed
out that the total situation must be considered in guarding
against such dishonesty. There are no devices and no format
of a checklist which will compensate for a lack of rapport
between teachers and students, for failure to prepare for the
administration of such evaluation instruments, or for short-sighted
use of data gathered in this fashion.


Uses of the Instruments 


In making use of data gathered by means of activity rec- 
ords, one of the problems which teachers face is that of 
summarizing the data in such a fashion as to obtain a reason- 
ably precise, yet brief, description of the interests revealed. 
Summaries of certain activity records for two students will 
be presented in order to illustrate the kinds of information 
about students which they make available. 


Elizabeth 


Elizabeth read 15 books during the year. Fiction included 
Mary Johnston's To Have and To Hold, Churchill’s The Crisis, 
The Prince and the Pauper, Bertita Harding's Farewell 'Toinette, 
and Let the Hurricane Roar; two college stories, Iron Duke and 
College in Crinoline; one dog story; The Count of Monte Cristo; 
The Girl of the Limberlost, Anne of Green Gables. Non-fiction 
included The Boy's Life of Will Rogers, Life with Mother, Men 
Are Like Street Cars, and Daily Except Sundays. Eight of these 
books were read during the summer and seven during the school 
year. The class of students of which Elizabeth is a member read 
an average of 12 books during the summer and 24 books during 
the school year. She did not read books of as great difficulty and 
maturity as did the group as a whole. The fiction she read is distributed
over Levels III (e.g., The Crisis), II (e.g., Jock the
Scot), and I (e.g., Girl of the Limberlost); whereas the median
maturity level of the fiction read by the group as a whole is IV.


In October, 1938, Elizabeth checked New Yorker as the only
magazine she read regularly; in March, 1939, Life. In October,
she was reading no magazine completely; in March, two, Life
and Look. She was below the class median in the number of
magazines read regularly and the number read completely. This
evidence, together with the number of books which she read,
suggests that she does not like to read to an extent comparable
with other students in her group.

Elizabeth far exceeded most of the members of her class in the 
number of motion pictures which she attended. She recorded 
seeing 39 during the summer and 86 during the school year. The 
median number of motion pictures attended by students of her 
class during the school year was 27; the range, 0 to 99. Also, she 
saw many of these 86 different motion pictures more than once.
Evidently, then, a large amount of her leisure time was spent in
viewing motion pictures. During the year, Elizabeth saw two
plays: The Boys from Syracuse and Abe Lincoln in Illinois, and
attended a performance of The Mikado. The median number of
plays, operas, and concerts attended by students in her class,
however, was five. 

Elizabeth's five favorite radio programs in December, 1938,
were Benny Goodman, Bob Crosby, Kay Kyser, Make Believe
Ballroom, and Tommy Dorsey. Of the 19 programs which she
checked as the ones she listened to regularly, seven were dance
orchestras such as the ones listed as favorites. In addition to
dance music, she listened regularly to five variety programs,
three question and answer programs, two dramatic programs
(Big Town and Lux Radio Theatre), and to Walter Winchell and
Jimmie Fiddler. Elizabeth was approximately at the median of
her class in the number of programs she heard regularly.


Claire 


Claire read ten books during the summer and 35 during the
school year. Five of these books read during the school year were
collections of plays, such as The Theatre Guild Anthology; two
were volumes of poetry; two were discussions of political and
social problems; and four were books about journalism and the
writing of short stories. The fiction she read during the school
year included two volumes of short stories and such novels as
Drums Along the Mohawk, My Antonia, House of Seven Gables,
House of Exile, Mary Roberts Rinehart's The Doctor, and Gone
With the Wind. More than half of Claire's reading was devoted
to non-fiction, whereas for her class as a whole approximately
[figure missing in the source] per cent of the titles were
non-fiction. Also she read more than
the average number of books during the school year. The fiction
which she read was of Levels III, IV, and V; this indicates that
she was reading books of approximately the same maturity as was
the group as a whole.

Claire checked eight magazines as those which she read regu- 
larly in October, 1938; and ten in March, 1939. These numbers 
are considerably above the group medians. In October she 
checked six magazines as the ones which she read completely; in 
March, five. Again, these numbers are above the group medians. 
The magazines which she read were American Home, Better 
English, Life, New York Times Magazine, Reader's Digest, Rider 
and Driver, Quiz Digest, and Time. 

During the school year Claire saw 18 different motion pictures; 
one of these, Grand Illusion, she saw twice. Some of these pic- 
tures which she liked very much were Grand Illusion, Four 
Daughters, Young Doctor Kildare, A Man to Remember, The 
Sisters, Brother Rat, Scarface, Gunga Din, Stage Coach, Made for 
Each Other, and Irene and Vernon Castle. Her comments about 
the motion pictures which she saw and the list of pictures which 
she liked suggest that she chooses her motion picture entertain- 
ment with some care. 

In addition to these motion pictures, Claire attended three
plays, Abe Lincoln in Illinois, American Landscape, Outward
Bound; and three musical performances, The Boys from Syracuse,
Ballet Russe, and The Hot Mikado. This is slightly above
the class median of five. Her activity record also records visits to
several museums and art galleries.


In December, 1938, Claire checked eight radio programs as
those which she listened to regularly. These included the Columbia
Workshop, three programs of classical music, Information
Please, two news commentators, and talks on politics. This number
is much smaller than the median number of programs heard
regularly by the group as a whole.


The leisure-time activities of these two students present
two quite different pictures. One has its chief emphasis on
activities such as attending motion pictures and listening to
the radio with very little emphasis on reading experiences;
the other presents quite a different pattern. The one reveals
interests which might be characterized as the more “popular”
ones, while the other reveals interests which might be characterized
as much more intellectual.

Data such as those presented in these illustrations should 
be of use to teachers who are concerned about the pattern 
of interests which students are developing. In order to use 
such data most effectively, it is important for the teacher 
to determine what kinds of interests he considers desirable 
for the student or the group of students, to exercise care in 
gathering the evidence, and to summarize this evidence in 
a convenient fashion. Cumulative summaries have several
advantages. One is that changes which take place over a
longer period of time may become evident. Another is that
such summaries may be passed on from teacher to teacher
as the student moves through school. Such summaries probably
should not be as lengthy as the illustrations given here.
However, data in tabular form similar to that suggested for
books can be recorded and cumulated by students. Summary
comments about the pattern of interests revealed, changes
observed, and the directions in which future changes should
take place might then be added by the teacher with relatively
little effort.

One further suggestion about the use of such data seems
warranted. Whenever possible, other evidence should be
combined with the evidence supplied by such summaries in
order to provide a more comprehensive description of the
student's interests. The observations made by teachers both
in and out of the classroom, evidence from other instruments
such as the Interest Questionnaire described in this chapter,
and the like, should prove useful either in corroborating
hypotheses or in revealing inconsistencies which need careful
study in order to arrive at a clearer understanding of the
student.


THE INTEREST INDEX 8.2a


In addition to records of activities, the questionnaire has 
also been found useful as a method of studying students’ in- 
terests. In order to investigate the possibilities of this tech- 
nique, a questionnaire was developed which listed three 
hundred activities which students were asked to mark “Like,” 
“Indifferent,” or “Dislike.” The questionnaire sampled activ- 
ities which were expected to reveal interests fostered by 
school subjects as well as interests in certain types of rela- 
tionships with other people. 


Method of Selecting Items for the Questionnaire 

The list of activities in the questionnaire was prepared by 
staff members who were concerned with evaluation instru- 
ments in the various academic fields. Each staff member ex- 
amined current textbooks and analyzed classroom activities 
in order to identify activities which might indicate an inter- 
est developed by his field. Each activity submitted was ex- 
amined critically by the entire staff to make sure that it 
fairly represented the interests developed by these fields 
and that it was actually carried on by students. All activities 
in which a student was apt to engage as a part or result of 
his work in several subjects were either eliminated or so 
sharpened that they became more clearly related to one field
only. An attempt was also made to include items indicative
of varying degrees or different depths of interest in a field:
from easy and attractive activities to those involving considerable
effort, hours of study, a high degree of proficiency,
etc.


The items thus selected were arranged in random order
in an inventory which was used experimentally in several
grades in 20 of the schools participating in the Study. On
the basis of the experience of staff members who interpreted
the findings to the faculties of these schools and in the light
of criticisms of teachers who felt that some of the areas had
not been adequately sampled or that the vocabulary of some
of the items was confusing, the questionnaire was revised.
This revision was also based upon an item analysis and
reliability studies of the responses of 250 boys and 250 girls
in typical high schools.


The Revised Form: Interest Index 8.2a 
The revised form of the questionnaire consists of only 200
items and thus can be given in one study period in a junior
or senior high school. The areas selected for this questionnaire
are: social studies, biology, physical science, English,
foreign languages, mathematics, business, home economics,
industrial arts, fine arts, music, and sports. In addition to these
areas, two larger categories which cut across most of them
were included: reading and manipulative. These two categories
are composed of items which appear in the above 12
categories and involve either reading or handwork. Thus,
for instance, “To make and classify a collection of insects” is
classified under biology and also under the manipulative
category. The item “To read such books as The Life of
Pasteur, Microbe Hunters, Arrowsmith, etc.” is classified
under biology and also under reading. There are 16 activities
in each of 11 of the above categories, 24 in social
studies, 35 in reading, and 38 in manipulative. The sort of
items included is indicated by the following sample. The
parenthesis after each item indicates how it is classified in
scoring.

1. To write stories. (English)

3. To go on trips with a class to find out about conditions
such as housing, unemployment, etc., in various parts of
your community. (Social Studies)

5. To visit stores, factories, offices, and other places of business
to find out how their work is carried on. (Business)

6. To correspond in a foreign language with a student in
another country. (Foreign Language)

7. To play baseball (either hard or soft ball). (Sports)²

14. To learn how to cook well (in camp or at home). (Home
Economics) (Manipulative)

15. To sing in a glee club, chorus, or choir. (Music)

16. To put eggs into an incubator and open one every day to
see how the chick develops. (Biology) (Manipulative)

17. To sketch or paint. (Fine Arts) (Manipulative)

21. To make chemical compounds. (Physical Sciences)
(Manipulative)

22. To make things of wood, metal, etc. (Industrial Arts)
(Manipulative)

[...] To do the arithmetic necessary in planning trips or parties
for the class. (Mathematics)


Interpretation of the Questionnaire


As indicated on the data sheet on page 341, the scores give 
the per cent of each student’s “likes” and “dislikes” in each 
of the categories and the per cent of his “likes” and “dislikes” 
for the whole questionnaire: i.e., for the 200 items. The per 
cent of items marked “indifferent” is not recorded but may 
be obtained by subtracting the sum of the “likes” and “dis- 
likes” in each category from 100. The Data Sheet also gives 
the lowest and highest scores and the group median for 
“likes” and “dislikes” in each category. 
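
The arithmetic just described can be sketched in a few lines of Python. This is an illustration only, not part of the original instrument: the function name `category_scores`, the response codes "L", "I", and "D", and the six-item questionnaire are all assumptions made for the example.

```python
from collections import Counter

def category_scores(responses, categories):
    """Per cent of likes, dislikes, and indifferents in each category.

    responses:  {item number: "L", "I", or "D"}  (hypothetical coding)
    categories: {category name: list of item numbers}; an item may be
                listed under more than one category.
    """
    scores = {}
    for name, items in categories.items():
        counts = Counter(responses[item] for item in items)
        pct_like = 100 * counts["L"] / len(items)
        pct_dislike = 100 * counts["D"] / len(items)
        # "Indifferent" is not tallied directly; it is the remainder.
        pct_indifferent = 100 - pct_like - pct_dislike
        scores[name] = (pct_like, pct_dislike, pct_indifferent)
    return scores

# Hypothetical six-item questionnaire; items 1 and 2 count under both
# categories, as the insect-collection item does in the Index.
categories = {"Biology": [1, 2, 3, 4], "Manipulative": [1, 2, 5, 6]}
responses = {1: "L", 2: "L", 3: "I", 4: "D", 5: "D", 6: "D"}
scores = category_scores(responses, categories)
```

Scores for the whole questionnaire follow in the same way by adding one more category that lists every item.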

This instrument is so simple in construction that it has
been found that teachers learn to interpret it in a short time.
As with most instruments, persons with greater experience
may get more from it than persons with limited experience.
As long as the interpreter confines himself to what he may
learn about the general direction of a student’s interests, the
procedure is simple and rather reliable. If, however, a
person attempts to find what effect a given course offered
in the school had upon the change of interests of a group
of students, certain complications arise, and rather advanced
statistical treatment of the data becomes a necessary condition
for arriving at valid conclusions. In the following presentation
of the method of interpreting results, the relatively
simple methods will be described.

² Sports were not classified as “Manipulative” because they were so nearly
universal interests that they did not identify students whose interests were
manipulative.

Each student's scores are interpreted in relation to the
group median and group range and in the light of his own
scores on other categories, i.e., his own pattern of scores.
The examination of scores of a student in relation to the
group median and the range for each of the categories of
summary will indicate in which areas the student has high
or low likes or dislikes, thus establishing tentatively the
deviate points in his preferences or dislikes. Thus, comparing
Chester's scores with the group medians, one notices high
dislikes in many areas and high likes only in three, whereas
Howard has high likes in most areas and few dislikes in any
of them.

One may further note the relative frequency of the significant
likes and dislikes and the areas in which they occur.
At this point it is helpful to examine the scores in terms of
certain broad common elements in the pattern of likes and
dislikes to locate the significant tendencies and characteristics
of the student's pattern of interest. Thus a frequency
of high likes in English, social studies, foreign language, and
reading indicates high preference for verbal activities. High
likes in biology, physical sciences, mathematics, and industrial
arts indicate interest in activities involving things and
precision manipulation. An artistic pattern is suggested by
high likes in music, fine arts, industrial arts, and home economics.
High likes in sports, business, industrial arts, home
economics, and manipulative activities would suggest an
inclination toward practical activities. If likes in one pattern
are accompanied with dislikes in a contrasting one, a further
reinforcement of a personal selection of activities is indicated.
Thus, if fairly high likes in English, social studies,
foreign language, and reading are accompanied by dislikes
in biology, physical sciences, and mathematics, a fairly strong
case of verbal interests is indicated.

It must be noted, however, that these general patterns are 
nothing more than suggestions for exploring general tend- 
encies. The areas liked and disliked group themselves in in- 
numerably diversified ways in any individual case, and it 
is therefore neither possible to describe all of the possibili- 
ties, nor wise to attempt to define any one pattern precisely 
or to follow its implications in any one individual case
slavishly.
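
With that caution in mind, the first step of the comparison (marking the areas in which a student's likes stand well above the group median, then seeing how they cluster) might be sketched as follows. The text sets no cutoff for a "high" score, so the 15-point margin is an assumption, as are the function names and all of the scores shown; the four groupings of areas are taken from the discussion of general patterns.

```python
# Groupings of areas named in the text; the labels are shorthand.
PATTERNS = {
    "verbal": ["English", "Social Studies", "Foreign Languages", "Reading"],
    "things and precision": ["Biology", "Physical Sciences", "Mathematics",
                             "Industrial Arts"],
    "artistic": ["Music", "Fine Arts", "Industrial Arts", "Home Economics"],
    "practical": ["Sports", "Business", "Industrial Arts", "Home Economics",
                  "Manipulative"],
}

def high_areas(student, medians, margin=15):
    """Areas where the student's score exceeds the group median by at
    least `margin` points (the margin is an assumed cutoff)."""
    return {area for area, score in student.items()
            if score - medians[area] >= margin}

def pattern_hits(likes, median_likes, margin=15):
    """For each pattern, list the member areas with high likes."""
    high = high_areas(likes, median_likes, margin)
    return {name: sorted(high.intersection(areas))
            for name, areas in PATTERNS.items()}

# Hypothetical per cent "like" scores for one student and group medians.
likes = {"English": 80, "Social Studies": 70, "Foreign Languages": 65,
         "Reading": 75, "Biology": 20, "Music": 45}
median_likes = {"English": 45, "Social Studies": 40, "Foreign Languages": 35,
                "Reading": 50, "Biology": 40, "Music": 40}
hits = pattern_hits(likes, median_likes)
```

The output is a starting point for exploration only; the same function applied to "dislike" scores would show whether a contrasting pattern reinforces the first.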

Applying this method to the scores given above, one may 
note that Chester has a negative reaction to all academic 
activities, verbal and scientific alike. Music is the only area 
of high positive interest to him. In contrast, Joseph has a 
high interest in academic activities of all types, but shows 
high dislikes in such practical areas as home economics,
business, and sports. Josephine’s preferences run predominantly
in the direction of verbal activities, with an additional
interest in music and business, with no dislikes in any area
but sports. Howard’s interest pattern is so catholic as to
arouse a suspicion of lack of discrimination. 

In addition to examining the scores of a student in rela- 
tion to those of other students in his group (i.e., examining 
them on the background of the group’s scale), one must also 
examine these scores in terms of the student's own scale. Some 
students have high likes in many categories, others have low
likes in most categories, or generally high dislikes. The total
score on “likes” and “dislikes” is indicative of the general
tendency of the student in terms of which his scores have
to be examined. For instance, a student may be one of the
highest in the group in liking music; if, however, all of his
likes are high, and on his scale music is one of the lowest,
a different meaning is attributed to his score than if we consider
it only with reference to the group score.

Thus in the case of Josephine, the score of 50 on disliking 
sports assumes great significance, because of the general ab- 
sence of dislike reactions. Similarly, Chester’s high dislike 
of mathematics, being part of a pattern of disliking all aca- 
demic activities, needs to be viewed as a part of this total 
negative reaction, rather than as a specific reaction to mathe- 
matics. The fact that Howard’s likes are uniformly high re- 
quires an investigation to see whether these are genuine in- 
terests or whether some such extraneous factor as lack of 
discrimination combined with a benevolent disposition is not 
playing a part. 
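
The two frames of reference can be set side by side in a short sketch. The function name and every figure below are hypothetical; the example simply reproduces the music illustration given earlier, in which a score that is high for the group is low on the student's own scale.

```python
def on_own_scale(category_scores, overall_score):
    """Deviation of each category score from the student's own overall
    score on the whole questionnaire (same kind of score throughout)."""
    return {area: score - overall_score
            for area, score in category_scores.items()}

# Hypothetical student whose likes are uniformly high.
likes = {"Music": 75, "English": 95, "Sports": 100}
overall_like = 90         # per cent of all items marked "Like" (assumed)
group_median_music = 50   # assumed group median for music

vs_group = likes["Music"] - group_median_music   # well above the group
vs_self = on_own_scale(likes, overall_like)      # yet low for this student
```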

One thing to be remembered in interpreting these scores 
is that interests are personal, and therefore a certain degree 
of uniqueness is both to be expected and desired. Therefore 
both the range and the pattern of interests should be judged 
in personal terms rather than by general norms. Thus, while 
a certain breadth of interests usually is desirable, it would 
be a mistake to assume that high likes in all areas indicated 
in the questionnaire is to be expected or is even desirable. 
Similarly, while negative reactions on the whole may be 
considered undesirable, one should expect individuals with 
selective interests to react negatively to some activities, while 
showing high positive reactions to others. 

In examining group patterns, similar methods need to be 
applied. Thus one may note the areas in which there are 
tendencies toward positive or negative reactions. This can 
be observed by comparing the medians with the medians 
of other groups or by noting the frequency of high likes and 
high dislikes in any given area. By this method one may 
note the prevalence of preferences in such verbal areas as
English and the like, or negative reactions
to artistic activities. Here also it is important to
bear in mind that a valid interpretation cannot be secured
by simply noting the areas of high likes or high dislikes.
These observations must be scrutinized in terms of the total 
pattern as well as in terms of other data on the same group. 
Thus a relatively high preference for physical sciences has 
one meaning when this is the only area of high preference, 
and a different one when it is one of many. High preference 
for foreign language in a group with no organized experience 
in this field and no special aptitude in this direction usually
suggests wishful thinking while the same pattern for a group 
with verbal ability and experience in this area can be taken 
to mean a thoughtful and actual interest. 


Value of the Questionnaire to the Counselor or Teacher 

The counselor will be interested chiefly in the configura- 
tion of the student’s per cent of likes, indifferences, and dis- 
likes in the various categories. The important point to note 
here is whether the picture is consistent with what is known 
about the student’s inclinations and interests, and if some 
inconsistency is discovered, this lead should be investigated. 
When considered in connection with other information avail- 
able, it should be helpful in academic or vocational guidance. 

Thus the preference pattern of the student suggests the 
areas which can be utilized for his further development. If 
it seems broad enough, and sensible enough for a given stu- 
dent, it suggests the line of activities for him to carry on 
and by which he will be enriched. If an undue narrowness 
is indicated, the spots of positive reactions can be mobilized 
as a springboard for expansion of interests. Thus high inter- 
est in physical sciences would suggest that reading in that 
area could be used to develop interest in reading, should 
that be lacking. Similarly, the pattern of negative responses
should suggest to teachers the areas in which remedial action 
may be needed or in which direct pressure should not be 
applied. Thus it would be futile to try to develop good work 
habits in English in the case of an individual with negative 
responses to this area until a more positive reaction is developed.
Other types of activities should be used to this end.

Since motivation is an important factor, the evidence in 
interests is also useful in explaining other facts about the 
students, such as high or low achievement in various areas, 
behavior in class, or activities in thinking. 

The classroom teacher may be interested also in the kinds 
of activities which a given student likes or dislikes or to 
which he is indifferent, within particular subject-matter
fields. Specific responses to individual items may be exam- 
ined for this purpose and new or more subtle patterns than 
those revealed in the category scores may become evident. 
It should be noted that the emphasis in this type of examination
of responses is not on the amount of interest which
a student may have, but on the nature of that interest. One
may find, for instance, on examining the scores that a stu- 
dent is at the group median in liking biology; on his own 
scale biology is neither particularly high nor low; but when 
his specific responses in this category are examined, one 
may find that his liking is centered on items which have to 
do with people, human physiology, health, etc. This knowl- 
edge should be of value to the teacher. 

The classroom teacher may also make a similar use of the 
responses of the group. The evidence on prevailing prefer- 
ences is helpful in planning classroom activities, areas to be 
studied or the approach to be taken. Thus exploration of 
printed material may be a very good way of studying a
given topic for one group, while other sources must be used
with groups who have a high negative reaction to verbal
activities. Diagnosis of group preferences and dislikes also
points to gaps in the curriculum to be filled, or unwise
emphases in the present curriculum. Thus in one school an
extremely high negative preference was shown for art
activities. The examination of their curriculum revealed that this
group had no opportunity in this field and could well profit
from it. In another case, an unusually high negative reaction
to writing was traced to a large amount of required writing 
resulting from separate assignments by several teachers, 
each of whom was unaware of the total load on the students. 

As in the case of individuals, the hypotheses regarding
constructive action to be taken cannot be formulated validly 
by using the data from this questionnaire alone. These data 
are descriptive and as such are helpful only in suggesting 
hunches regarding the causes of preferences or of dislikes, 
yet for a remedial or constructive program it is necessary to 
have a fairly good idea of the cause of the interest pattern 
shown. Therefore it is imperative to consider these data in 
context of other evidence before decisions are made regard- 
ing what to do about an individual or a group. 


Factors Influencing Accuracy of Results 

The usefulness and accuracy of results of this instrument 
depend on at least two factors: the degree to which the items 
sample activities which are affected by the curriculum in
the school in which the instrument is used, and the sincerity 
of the response made by the students. 

The first of these may be determined by a careful exam- 
ination of the specific items by the teachers who expect to 
use the instrument. If it is found that the items do not sam- 
ple activities which reveal interests that they are trying to 
develop, or activities to which they would like to know their 
students’ reactions, a similar instrument can easily be constructed
which includes both.

The responses of the students will be most sincere if the 
instrument is not regarded as a “test” in which high scores 
are desirable. If the students recognize that the information
which they convey through the questionnaire may be helpful
in planning class work, their cooperation can be readily
enlisted.

In making interpretations it should be remembered that 
in this instrument the student is asked to tell how he feels
about certain activities: whether he likes them, is indifferent 
to them, or dislikes them. These feelings are not necessarily 
an index of his performance in any of the areas sampled. 
A student may do poor work in class and still like many of 
the activities listed. Likewise a student may do very well in 
class and dislike many of the items. The reasons for this 
seeming discrepancy may be worth exploring. 

For certain types of interpretations it is advisable to com- 
pute averages for boys and for girls separately, although this 
greatly extends the scope of the statistics which are needed. 
The mean, standard deviation, and coefficient of reliability 
of each category for the “like” scores from one sample popu- 
lation of 542 eleventh grade students are given in Appendix 
V. Reliability coefficients computed by the Kuder-Richardson 
formula for this sample range from .79 to .92. The median 
coefficient is .89, and only three categories are below .85. 
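
The text does not say which of the Kuder-Richardson formulas was applied. As a sketch only, assuming formula 20 and assuming each item is scored 1 when marked "Like" and 0 otherwise, the coefficient for one category could be computed as below. The function name `kr20` and the sample responses are hypothetical and bear no relation to the data in Appendix V.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.

    item_matrix: one row per student, one 0/1 entry per item.
    r = k/(k-1) * (1 - sum(p*q) / variance of total scores),
    using the population variance of the totals.
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion marking "Like"
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / var_total)

# Hypothetical responses of four students to a three-item category.
sample = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(sample)
```

The function assumes that the total scores are not all identical; if they were, the variance would be zero and the coefficient undefined.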

A more thorough discussion of the interpretation and possible
uses of this technique will be found in the next chapter.
It will also be seen there that the study of interests can be
used for a different purpose, namely the evaluation of personal
and social adjustment. The validity of the instrument
will be treated in this connection.


Chapter VI 


EVALUATION OF PERSONAL AND SOCIAL
ADJUSTMENT


DISCUSSION OF THE OBJECTIVE


History of the Objective 

One of the concerns voiced by the schools cooperating in 
the Eight-Year Study was that of promoting the personal 
and social adjustment of their students. In an effort to clarify 
the meaning of these terms and to devise ways in which at 
least a few of the aspects of personal and social adjustment 
might be appraised, groups of teachers and of specialists in 
various pertinent fields met together. The Committee on the
Study of Adolescents of the Commission on Secondary
School Curriculum of the Progressive Education Association,
for example, provided special help in attempting to
clarify the meaning of this objective. The study of the ways
in which the schools were gathering and recording evidence
of students’ adjustment revealed that many techniques of
appraising personality and social adjustment, though they
suffered from one shortcoming or another, were of promise.
The work of the regional committees on anecdotal records
was especially helpful in pointing to ways in which teachers
might collect evidence which would give some insight into
the personality problems of students.

Urged by the cooperating schools to devise more practicable
means of appraising personal and social adjustment,
the Evaluation Staff began an extensive study of this problem
of appraisal early in 1938. Before the results of this study are
presented, however, it will be necessary to attempt to
distinguish between personal and social adjustment, to clarify
the concepts of adjustment, and to attempt to set up a list of
criteria for a method of appraisal.


Differentiation between Personal and Social Adjustment 
Personal adjustment is thought of broadly as including the 
subjective feelings of the individual, such as feelings of ade- 
quacy and inadequacy, personal happiness and unhappiness, 
the adjustive reactions of the individual, the presence or 
absence of inner conflicting tendencies. Social adjustment is 
thought of as being directed toward the adequacy and effec- 
tiveness of a person’s interaction with other people in face- 
to-face situations. Relationships with age-mates, older and 
younger people, with the opposite sex, etc., are included 
under this heading. It also includes the person’s attitudes to 
the mores and standards of the group of which he is a 
member. It is recognized that the division between personal 
and social adjustment is, in some respects, an artificial one 
and that they should be thought of as being intimately con- 
nected and interrelated and as representing two aspects of 
the emotional adjustment of a person to his environment. 


Discussion of “Adjustment” 


There appears to be considerable difference of opinion
about what constitutes adjustment. Because this term lacks
clarity and may have different meanings to different persons,
it is necessary to attempt to clarify the particular concept of
adjustment which underlies the study to be reported in this
chapter.

Broadly speaking, the investigators regard personality as a
[...]. Since personality is viewed as a
product of the interaction of forces within the individual
and the interaction between the individual and his surroundings,
it must be seen in the light of his past history and
against the background of his present environment.

The point of view underlying this investigation may be
clarified somewhat by indicating how it differs from the ap- 
proach which has governed certain other attempts in this 
field. The idea that certain behaviors, in and by themselves, 
are indicative of “good” or “poor” adjustment seems to be 
rather widely accepted. This point of view has been made 
the basis of a number of attempts to appraise students’ ad- 
justment. The procedure involves the construction of a be- 
havior scale which lists sample statements of both “good” 
and “bad” behaviors. The mere counting of these behaviors 
is expected to give an adjustment score or index for the 
student.¹

Such classification of behaviors as "good" or "bad" in 
themselves is a relatively simple attack upon the problem. 
It leaves out important factors which need to be considered 
prior to arriving at a judgment regarding the person's adjust- 
ment or maladjustment. Two major criticisms may be made 
of this concept of adjustment. 

It is an oversimplification which omits consideration of 
the individual, his motivation, surrounding temporal and 
environmental conditions, etc. The courts, for example, do 
not hold that certain acts constitute a crime everywhere and 
under all circumstances. Before evaluating an act, a careful 
study is made of the motivation of the indicted person,
consideration is given to the extenuating circumstances, etc. The
final judgment is also made in the light of the history of the
behavior of the person. Likewise, when parents or teachers
judge the behavior of children, they are aware of the necessity
of attempting to determine not only what was done but
also why it was done, under what circumstances the behavior
occurred, and the like.

Furthermore, such a classification of behaviors as “good”
and “bad” in themselves suffers from another oversimplification,
that of not discriminating between the condition and
the symptom of the condition. This may be clarified by the
following analogy: an infection may be said to be a condition
or a state of an organism, whereas the high fever which is
apt to accompany the infection is an outcome or symptom
of the infection. Although the fever is indicative of an infection
and therefore represents something undesirable, nevertheless
in itself and under the circumstances it is believed to
be a desirable adjustive reaction of the organism to the
infection. In making lists of undesirable behaviors there is a
tendency to use both kinds of behaviors (those which may
be thought of as “conditions” as well as those which may be
thought of as “symptoms”) and to neglect the fact that they
are phenomena of an entirely different order and that they
have to be evaluated differently.

¹ For a discussion of the present status of personality measurement and
of the difficulties involved, the reader is referred to Chapters I and II of
Fulcra of Conflict, Douglas Spencer (New York, World Book Co.,
Yonkers-on-Hudson, 1939).

Thus, there appear to be cogent reasons against beginning
a program of appraisal of adjustment with the focus of the
inquiry centering on an attempt to determine whether the
adjustment of the individual is desirable or not. Determination
of what specific behaviors may constitute “desirable
adjustment” for a given individual is legitimate only at the
end of a study of a personality, when the judgment can be
based on a great many considerations. Even then it is apt to
be a value judgment. Obtaining a picture revealing how the
individual functions, what adjustive devices he employs,
seems to be of greater value.


Another rather commonly accepted point of view is that
adjustment consists largely in conformity to social standards
and demands. This point of view neglects the importance of
adjustment in terms of oneself, i.e., the importance of being
able to handle satisfactorily one's own impulses and strivings:
the importance of being consistent with oneself. It must be
borne in mind that the lack of this type of adjustment expresses
itself frequently in a variety of serious overt or veiled
emotional disturbances.* In this connection the following
may be said regarding what must be included in thinking 
about adjustment. On the one hand, we have the individual 
with his native needs, impulses, and drives which seek satis- 
faction, and which undergo certain changes with age. On 
the other hand, we have society which has its needs and 
which makes certain demands on the individual. These de- 
mands on the individual vary in different cultures and de- 
pend on the age and sex of the individual, social status of 
the family, and similar factors. Maladjustment of the indi- 
vidual thus may be, broadly speaking, one of two kinds. In 
one instance the individual may comply to such a high degree
to the demands of society that his native drives become
thwarted, cramped, and distorted. In such cases the individual's
behaviors with regard to society are acceptable to
society, but he pays too high a price for them himself. In
such an event some neurotic condition, accompanied by a
good deal of anxiety and considerable personal unhappiness,
may be found in him. In the second type of maladjustment
the individual rebels against society, its demands and restrictions.
In extreme cases such a person may suffer from
society’s ostracism or other types of punishment, but his difficulty,
nevertheless, will be largely one of social adjustment.
This is, of course, an oversimplification of the picture, yet
for a broad frame of reference it is sufficiently correct. It
permits us to see that in general optimum adjustment may
be thought of as a compromise between the individual and 
the group to which he belongs, in which each party adjusts 
to the other to a certain extent in order to avoid conflicts
within the individual or clashes between the individual and
the social group.

* The fact that educators are prone to regard as the most serious problems
those of non-conformity, and to underestimate the importance of problems
which are not brought to light through anti-social behavior, has been
demonstrated in a number of studies. The best known of these is E. K.
Wickman, Children's Behavior and Teachers' Attitudes, The Commonwealth
Fund, 1928.

Desirable adjustment for the individual may then be 
thought of as a process of maturation and adaptation during 
which he is able to integrate successfully (i.e., without neu- 
rotic compromises or anti-social acts) his native impulses 
and drives with those expectations or demands which are 
imposed upon him (with reference to his age, sex, social 
status, race, etc.) by the group to which he belongs. 

The above discussion leads to the formulation of the fol- 
lowing point of view: 

1. The adjustment of the individual must be conceived as 
a complex of feelings and behaviors which are meaningful 
only when seen in relationship to each other, rather than as 
a series of discrete behaviors regarded as meaningful in 
themselves. 

2. This complex of feelings and behaviors must be evalu- 
ated in terms of the status of the individual (i.e., his age, 
sex, position in society, etc.). The same behavior may be 
evaluated differently when observed in the case of a six-year-
old and a sixteen-year-old, in a boy or in a girl.

3. The adjustment of the individual must be considered 
in terms of the relationships between his own strivings, pur- 
poses, and past conditionings, and also in terms of the rela- 
tion of these to the demands or expectations of society. His 
adjustment must be viewed as a process rather than a state. 


DISCUSSION OF THE TECHNIQUE OF APPRAISAL OF THE
OBJECTIVE

Desirable Characteristics of an Instrument for
Appraising Personal and Social Adjustment


Being well aware of the impossibility of evolving any 
single device for appraising all of the pertinent factors which 
need to be considered in the evaluation of the life adjust- 
ment of an individual, the staff set out to explore feasible
ways of appraising at least a few of these factors. During this
process of exploration an effort was made to define the gen- 
eral characteristics which were felt to be desirable in an 
evaluation instrument for this purpose. 


1. It should be a technique applicable to a large number
of students at one time. 

Since the paper-and-pencil technique is much more economical,
as far as the examiner's time is concerned, than
the interview, anecdotal record, etc., and thus permits testing 
a larger number of students at the same time, and since it 
rules out one of the possible subjective factors—the biases 
of the observer—this technique was thought to be preferable. 


2. The evidence obtained from different individuals 
should be comparable. 

It was felt that the form in which the data were to be col- 
lected should be such that there would be an opportunity 
for comparison of results. To the extent that the response-
pattern of one individual can be compared with that of another
or that of a group, it should be possible to discover
those ways in which he is similar or dissimilar and thus gain 
further insight into how his personality is organized. Com- 
parability of results might also lead to investigation of group 
phenomena.


3. The technique should be indirect. 

In devising an appraisal instrument it was considered very
important that the approach be relatively indirect. One difficulty
which is implicit in inventories which attempt to get
at the individual's private and intimate feelings is the fear 
and anxiety which most people experience when they feel 
that they are being “tested” or evaluated personally. Whereas 
they frequently seem able to consider certain abilities as 
actually extraneous to themselves and are, therefore, not
threatened when an attempt is made to measure these abilities,
they usually feel defensive about obvious attempts to
get at their private feelings. The anxiety aroused may be so 
great as to completely inhibit or invalidate the response. 
Thus, it was felt that the instrument should not be obviously 
a “Personality Test” but rather should attempt to appraise 
personal and social adjustment in a more indirect manner. 


4. The subject should be called on to express himself 
rather than to appraise himself. 

In addition to the fact that a great deal of anxiety is 
aroused by the demand for self-appraisal, it is also a matter 
of general psychological knowledge that few persons are 
capable of objective self-evaluation with regard to their emo- 
tions and personalities. Attempts to make a subject evaluate 
himself and his own emotional reactions presume a knowl- 
edge of self which is lacking in most individuals. With this 
consideration in mind, it was decided that asking the subject 
to appraise himself should be avoided; instead, he should
be given an opportunity to express himself in a number of
different ways.


5. The instrument of appraisal should provide a varied 
response—a field upon which the subject can express 
himself. 

This method of appraisal differs somewhat from one of 
the common conceptions of a test. In many tests the subject 
is given a problem which is presumably comparable to a 
life situation and his performance in attaining the solution 
of the problem is interpreted as a measure of his ability to 
cope with an analogous situation in life. In an instrument 
which attempts to appraise personal and social adjustment, 
however, it was felt that it might be undesirable that the 
problems be thus limited by the examiner rather than re- 
vealed by the individual. It seemed that the most desirable 
technique to use would be that of presenting a large variety
of stimuli to which each individual might react emotionally
in a variety of ways, thus providing a field, so to speak, upon
which the individual might draw his own design. This 
means, also, that there should be opportunity for an ex- 
tremely large number of configurations of response, in order 
that each individual might have the maximum practicable 
opportunity to project his personality. Single responses, then, 
would have meaning chiefly as they became a part of a 
larger pattern. Each response could be interpreted in the 
light of every other response. Whereas it is not possible to 
provide a field so large that an individual can express his 
whole personality, even a limited field in which the inter- 
relationships are traceable is apt to provide a great deal of 
useful material. 


6. It should give the individual pattern of the person- 
ality of the subject. 

In order to get at the more detailed picture of the per- 
sonality, one has to guard against the use of too broad classi- 
fications, such as “sociable” and “a-sociable.” Such classifica- 
tions tend to obliterate individual differences and to be useful 
only in very extreme cases. It was thought desirable that an 
appraisal instrument give a description aiming at something 
more than a rough categorization of the personality. This 
description, if it is to be useful to educators, should go be- 
yond what is readily observable in a classroom situation. It 
should lead to deeper insights into the individual, his motiva- 
tion, his system of subjective meanings attached to things, 
his values, etc. Understanding another person is an understanding
of this person’s acts in terms of his feelings and not
in terms of the feelings of an outsider. 


7. It should be open to interpretation at different levels.


It was felt that to demand from the interpreter a certain
degree of psychological understanding is legitimate. On the 
other hand, it was felt that the instrument should not be so
complicated that only a person with specialized training
could interpret it. Ideally such an instrument should give re- 
sults which would permit deep interpretation by persons 
with a good deal of training and experience and still yield 
some useful material to persons with limited training. 


Exploratory Studies 


1. Use of the Interest Questionnaire


While the above criteria for a technique of appraisal of a 
personality were being considered, several exploratory 
studies were conducted with tests devised by the Evaluation 
Staff for other purposes. It was thought that since personal 
and social adjustment was intimately related to these other 
areas, a great economy would be achieved if it were found 
possible to draw inferences for the present objective from the 
results of other tests. Moreover, such an approach would be 
ideal from the standpoint of indirection.

Of all the tests examined from this angle, the first Interest 
Questionnaire, Form 8.2, gave the best results. This ques- 
tionnaire provided data on the students’ feeling reactions to 
800 activities commonly carried on in school.² The students
responded to the items in terms of like, indifferent, dislike. 
In an exploratory Study an attempt was made to discover 
what kinds of things and how many one might say about the 
personal and social adjustment of 33 college students, using 
the data from this questionnaire. The students selected for 
study were attending an institution which was known to have 
elaborate and detailed records on its students. 

The descriptions written from the questionnaire results
were compared with teachers’ ratings of these students on a
Descriptive Trait Profile,³ a rather flexible personality rating
scale devised for the purpose of validation of this study.
Each student was rated by four teachers. Although validation
through a comparison of descriptions of personalities
presents certain difficulties of a purely semantic nature, those
who examined the data felt that quite similar portraits of
students were presented by the teachers and by the interpreters
of the questionnaire. Specifically, it was estimated
that the personality sketches of 27 of the 33 students bore a
remarkable similarity to the teachers’ descriptions. In some
cases the questionnaire revealed traits which would seem to
be completely unrelated to interests as usually conceived.
These results were sufficiently encouraging to justify using
the interest questionnaire approach and exploring it further
as a possible means of appraising personal and social adjustment.

2 This questionnaire has since been revised. The revised form, Interest
Index 8.2a, is described in the chapter on Interests.

3 P.E.A. 2968 (mimeographed), University of Chicago, Chicago, Ill.


2. Significance of Interests

The approach taken was directly dependent upon the point 
of view held as to the significance of interests. This point 
of view differed somewhat from earlier and other current 
concepts of interests. 

In the present study interests were approached from the 
point of view of the relationship between the individual and 
the reaction or interest. It was thought that unless we are to 
consider interests to be merely chance reactions, arbitrary 
and capricious, psychological fungi as it were, playing no 
part in the fundamental body of the individual's character, 
we must assume that they are a result of the interaction of
deeper desires with environmental forces. Interest then takes 
on the significance of an index of emotional tendencies and 
of the personality pattern of the individual. It becomes the 
expression of the aims of the individual, conscious and ex- 
pressed, or unconscious and to be inferred. Liking and dis- 
liking, accepting and rejecting activities, become significant 
as expressions of some of the basic elements and drives 
within the individual. For the purposes of this study specific
interests in themselves become rather insignificant; the emphasis
is no longer on the desirability of interest within a
certain field, but rather on the significance of interest for the 
inference of underlying urges and aims. Furthermore, in- 
terests were not thought of, in relation to this problem, as 
discrete, separable entities, but as interrelated and inter- 
acting. 

Those who can accept this point of view about the signifi- 
cance of interests can readily see how an interest inventory 
can be used as a projective technique, as “a means of dis- 
covering the way in which an individual personality or- 
ganizes experience, in order to disclose or at least gain insight 
into the individual's private world of meanings, significances,
patterns, and feelings.”* The Interest Questionnaire offers to
the individual the opportunity to reveal his way of organizing
experience by presenting him with a large number of
activities from different areas to which he reacts emotion- 
ally, in terms of like, dislike, and indifferent. 


3. Discussion of the Significance of Like, Indifferent, 
and Dislike Responses 


The exploratory study and interviews with students
showed that certain inferences may be drawn from the types
of responses which the student gives to the questionnaire. It
was possible to do this partly on theoretical grounds, and
partly because the examiners of the students’ responses
trained themselves to seek in the data every possible clue to
the emotional state of the subjects. Thus, it was found that
“like,” “indifferent,” and “dislike” may not be taken as meaning
“just” like, indifferent, dislike, but may be thought of as
having much more affective significance. “Like” may mean,
for instance, “Is strongly attracted by it, loves.” “Indifferent”
may mean either no affect, or withdrawal or repression of
affect, or an avoidance of expressing an affect. “Dislike” may
express active antagonism, fear, resentment. Thus, for instance,
it seemed reasonable to assume that a student who
expresses a “dislike” response to a great many school activities
does not “just happen” not to enjoy a large number of
the listed activities but, perhaps, reveals an undercurrent of
general antagonism to school.

* L. K. Frank, “Projective Methods for the Study of Personality,” The
Journal of Psychology, 1939, p. 402.


DESCRIPTION OF THE QUESTIONNAIRE


The preliminary considerations and the results of the exploratory
studies suggested as a next step the extension and
elaboration of the interest inventory technique. This led to
the construction of three inventories: Interest Index 8.2a, described
in Chapter V, and Interests and Activities 8.2b and
8.2c. Each of these inventories consists of 200 items to which
students respond by: like, indifferent, or dislike. Interest
Index 8.2a consists of items relating to school studies and
subjects, whereas Interests and Activities 8.2b and 8.2c
deal with non-academic activities. It was thought that three
questionnaires dealing with the intellectual, esthetic, social,
and inner mental and emotional areas of functioning ought
to give a rather comprehensive picture of the organization
of the energies of the individual. It was further assumed that
the above areas are intimately interrelated and that if
attention is focussed on the interaction among them rather
than on the examination of them as separable units, one
ought to be able to infer a great deal regarding the
functioning of the individual.


Method of Gathering Material for the Questionnaires


In order to make certain that the questionnaires contained
material taken from life situations of the students, leads for
the choice of the items were obtained from children. A class
of junior high school students, known rather well by one of
the investigators, was told that information on children's
interests would be of use to educators, writers of radio programs,
publishers of children’s books, etc. They were asked
how they would go about discovering such interests. After a 
study of the problem, the class arrived at the following 
methods of studying children’s interests: (1) a carefully 
drawn up but informally administered questionnaire; (2) 
diary records, which were to include all activities engaged 
in by the members of the group, with comments as to how 
they had felt about them; and (3) a survey of the group as 
to what things its members wanted most to do or to have. 
The questionnaire contained such questions as: “What things 
do you like to do most when you are alone?” “What things 
do you like to do with others?” “What do you like pretend- 
ing?” “What do you like to do when you feel happy?” “What 
do you like to do when you feel sad?” etc. The questionnaire, 
diary, and survey yielded a large variety of activities which 
formed the basis for the choice of items. As far as possible, 
the original phraseology of the children’s statements was 
kept. Later a similar study was conducted in another city 
with a group of high school students; the resemblance be- 
tween the two activity lists was striking. 


Criteria for Selection of Items 


In selecting items for the questionnaires, three criteria
were kept in mind: (1) that the item represent a fairly characteristic
or common activity of children, (2) that the activity
seem to belong to one of the clusters or categories of
activities which were thought to be related to personal and
social adjustment, and (3) that the activity listed be not too
threatening. In general, there was no effort to find single
crucial items which would be diagnostic in and by themselves.
Doing so would be contrary to the whole philosophy
of study of personality as it has been outlined in the preceding
discussion. In a sense, each item in a category may be
said to be significant only as it is viewed as a part of the total
configuration of responses.


Discussion of Categories in 8.2b and 8.2c* 

Since there seems to be no generally accepted frame of 
reference in terms of which a personality should be studied, 
the selection of categories was made in terms of the thinking 
of the investigators regarding some of the more important 
factors which need to be considered in a study of a person's 
adjustments. Since a possible approach toward the evaluation 
of adjustment was thought of as a systematic study of the 
individual's ways of making adjustments, rather than as an 
appraisal of whether or not he is “well adjusted,” no categories
were designed to be indicative of “good” or “poor”
adjustment in and by themselves. Each category was thought 
of in the light of the possible meaning it might have when 
examined in relation to other categories. This must be borne 
in mind when examining the categories. 

An effort was made to choose categories which so far as 
possible would yield information relative to the various kinds 
of adjustments the individual has to make. It should be noted 
that all of the information necessary for the description of 
an individual’s adjustment cannot be obtained from the 
questionnaire. Information as to the environmental factors, 
the individual's past history, and so forth, must be obtained 
in some other way. The present technique aims largely at 
tracing some of the subjective feelings of an individual and 
at making inferences from these regarding the organization 
of his personality. 

It will be seen later from the discussion of interpretation 
and from the sample case analysis that each student, without 
knowing that he is doing so, determines himself the organization
of the categories by means of his reactions to the items.
Depending on his responses, any of the categories may come
into a dominant position in the interpretation or may come
to be regarded as of minor importance in his particular case.
Thus, interpretations take their lead from the student and his
way of responding.

* The activities listed in the questionnaires are not grouped by categories;
the keyed list of items can be obtained in mimeographed form from Progressive
Education Association, University of Chicago, Chicago, Ill.

Nevertheless, in order to facilitate the exposition of the 
thinking of the investigators, in the following presentation 
the categories are grouped into three major areas: (1) “Or- 
ganization of impulses and drives” encompasses categories 
which shed light predominantly on the way in which an 
individual handles some of his impulses; (2) “Human rela- 
tionships” lists categories which are meant to tap predom- 
inantly the feelings of the student regarding social interac- 
tion of various types; (3) “Fantasy life” contains categories 
which are meant to reveal predominantly the extent and type 
of fantasies in which a student engages or which he avoids. 

It should be emphasized that the above three areas are not 
thought of as discrete and separate entities. This classifica- 
tion is merely a method of organizing certain emotional dis- 
positions which are in constant interaction. It should also be 
remembered that depending on the configuration, the same 
category may have different meanings. Furthermore, any 
one meaning attached to one of the categories is apt to influ- 
ence the significance of some of the other categories. 


1. “Organization of Impulses and Drives” 


а and b. Acceptance of Own Impulses and Severity 
with Oneself 


Those working on the construction of this instrument felt 
that one of the most fundamental problems with which every 
growing child has to cope is the reconciliation of his primi- 
tive drives and impulses with the restrictions which social 
living and social mores impose on him. As has been stated 
earlier in the formulation of the definition of adjustment, the 
desirable pattern was thought of as a balance between
acceptance of the primitive impulses on the one hand
and, on the other hand, considerations of social expedience 
and actual incorporation into the individual’s personality of 
some of the standards and restricting concepts of the social 
milieu. Difficulties in achieving such a balance are very com- 
mon. These difficulties may be said to fall into two broad 
categories. The first evidences itself in a personality which 
continues to operate primarily on the basis of its primitive 
impulses and urges, and disregards or fails to incorporate 
the social standards and taboos. The second type of difficulty 
may express itself in a too rigorous repression of the im- 
pulses and their gratification and may result in a truly in- 
hibited, extremely self-censoring and “over-restricted” personality.

Categories entitled “Acceptance of Own Impulses” and 
“Severity with Oneself” attempt to bring to light the stu- 
dent’s status among his classmates with reference to the 
above areas of adjustment. In a sense, both of these cate- 
gories aim to appraise the same area of adjustment, but ap- 
proach it from two opposite poles. Thus, a very high score 
on “Severity” would tend to indicate that at least in certain 
respects the student’s “Acceptance of Own Impulses” is 
under actual or potential censorship. A very low score on 
“Severity” would tend to suggest that “Acceptance of Own 
Impulses” functions with considerable freedom. 

Examples from the category “Acceptance of Own Im- 
pulses” are: being a little sick and staying in bed all day; 
eating so much I can’t take another bite; saying whatever 
comes into my head. 

Examples from the category “Severity with Oneself” are: 
setting myself tasks to strengthen my will power; working
on myself, improving myself in some way; taking a cold 
shower on a winter morning. 
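
As an illustration only, the tallying of responses by category might be sketched as follows. The sketch assumes a hypothetical key that maps each item to its category (the six items used are the examples quoted above; the actual keyed list is not reproduced here), and it assumes that a category "score" is simply the percentage of like, indifferent, and dislike responses within that category. The names ITEM_KEY and tally_by_category and the percentage convention are assumptions made for this sketch, not part of the original procedure.

```python
from collections import Counter

# Hypothetical key: item text -> category. Only the example items quoted in
# the text are included; the real keyed list is not reproduced here.
ITEM_KEY = {
    "being a little sick and staying in bed all day": "Acceptance of Own Impulses",
    "eating so much I can't take another bite": "Acceptance of Own Impulses",
    "saying whatever comes into my head": "Acceptance of Own Impulses",
    "setting myself tasks to strengthen my will power": "Severity with Oneself",
    "working on myself, improving myself in some way": "Severity with Oneself",
    "taking a cold shower on a winter morning": "Severity with Oneself",
}

RESPONSES = ("like", "indifferent", "dislike")


def tally_by_category(responses, key=ITEM_KEY):
    """Return {category: {response: percentage}} for one student.

    `responses` maps item text to "like", "indifferent", or "dislike".
    Percentages are computed over the items answered in each category
    (an assumed convention, see the lead-in above).
    """
    counts = {}
    for item, answer in responses.items():
        if answer not in RESPONSES:
            raise ValueError(f"unexpected response {answer!r} for item {item!r}")
        category = key[item]
        counts.setdefault(category, Counter())[answer] += 1

    percentages = {}
    for category, counter in counts.items():
        total = sum(counter.values())
        percentages[category] = {
            r: 100.0 * counter[r] / total for r in RESPONSES
        }
    return percentages


if __name__ == "__main__":
    # Invented responses for one imaginary student.
    answers = ["dislike", "indifferent", "dislike", "like", "like", "dislike"]
    student = dict(zip(ITEM_KEY, answers))
    for category, pct in tally_by_category(student).items():
        print(category, {r: round(p, 1) for r, p in pct.items()})
```

In keeping with the point of view stated earlier, such tallies would be read only in relation to one another and to the rest of the student's responses, never as verdicts in themselves.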


c. Preoccupation with Cleanliness


Early training in cleanliness usually represents the first 
demand which the social mores make upon the child to regu- 
late his impulses. This training is often accomplished by 
building up strong feelings of shame or guilt about bodily 
functions and the body itself. Various feelings of shame and 
guilt, conscious or unconscious, may result in undue preoc- 
cupation with cleanliness, purity, fear of contamination, fear 
of germs, etc. This type of anxiety seems to be particularly 
common in our society. This category is designed to furnish 
indications as to the extent to which, and the way in which, 
the individual has accepted and incorporated into himself 
this early experience. Thus, very low likes and high dislikes 
in this area might indicate a lack of acceptance of these 
demands of society, whereas, on the other hand, very high
likes and low dislikes might be symptomatic of other tensions
in this area.


d. Methodical 


The child's attempts to master his impulses may result in 
a certain rigidity of personality with a tendency to compulsive
behaviors. Most of the activities in the methodical category
are quite common behaviors, behaviors which are usu-
ally even encouraged by educators. They are activities which 
are characteristically rigidly patterned and repetitive; they 
also are activities which involve collecting, arranging, classi- 
fying, etc. Examples of the activities listed in this category 
are: copying papers to make them neat; keeping a calendar
or notebook of the things I plan to do; making up catalogs
and card files.


e. Aggression


Making the large number of adjustments which every
child has to make, enduring frustrations, having to inhibit
his impulses, invariably and quite normally produces and
contributes to the reservoir of stored hostility within the
child. The expression of this hostility may take the form of 
overtly a-social acts; more frequently, however, it takes a 
more or less socially acceptable form, which serves as an 
outlet for the hostile feelings, without seriously imperiling 
the person. Categories entitled Aggression in 8.2b and in 
8.2c are composed of activities through which hostile impulses
frequently find an outlet. Some of these involve overt
acts, such as: hitting someone who has annoyed me very 
much, always telling people the truth even when it might 
hurt their feelings, picking someone's argument to pieces; 
others involve thinking: thinking of what I'll do when I grow 
up to people who have been mean to me, looking at pictures 
of death and destruction. 


2. “Human Relationships”


f. Relationship with Family


Items dealing with activities commonly carried on in and 
with the family were selected for the drawing of inferences
about the extent to which the student enjoys, is indifferent 
to, or does not enjoy his home life. An effort was made to 
have a wide spread of activities, ranging from such activities 
as having a good argument or serious discussion with the 
family to cleaning up after meals, washing or drying dishes.


g. Relationship with the Same Sex 


This category is composed of activities in which usually 
only students of the same sex participate. It was thought 
that liking or disliking such activities as belonging to a boys’ 
club or girls’ club, staying overnight at a friend's house, etc., 
might be indicative of a student's feelings, particularly when 
reactions to these activities are seen as part of a whole set of 
reactions in the area of human relationships. 


h. Relationship with the Opposite Sex


The items in this category were so selected that a high
score in liking them would indicate a person who attaches a
value to activities requiring the participation of both sexes. 
This category may be broken down into: 


1. Ordinary activities with the opposite sex, such 
as parties, dancing, etc. 

2. Activities implying a stronger interest in the op- 
posite sex than the above—making oneself at- 
tractive, courtship, etc. 

3. Activities indicating a less openly displayed or
perhaps vicarious interest in the opposite sex— 
such as reading love novels, watching others 
who are in love, day-dreaming about it, etc. 


i. Identification with Others


The purpose of this category is to investigate the extent to 
which a student likes, or likes to think of himself as liking, 
activities which involve a Strong personal interest in other 
people, close, intimate friendships, sympathetic taking care 
of others, defending the molested, etc. Many of these items 
are concerned with imagining things about other people, or 
about one's relationship with other people, rather than with 
actually doing things. Thus, it is possible that a student who 
has not yet actually established successful social relations 
may still like these activities. This category is designed, then, 
to show the extent to which the student has a value for such 
relationships. Characteristic items are: having a lot of close 
friends with whom I can talk about anything; trying to find 
out what a quiet shy person is really like; discussing with 
younger boys or girls what they like to do and how they feel
about things.


j. School Activities


This category is designed to reveal the student's attitudes 
toward student organizations, the school, school life, etc. It is 
composed of activities commonly carried on in school, such
as: being an active member of a school club, being on class 
committees, going to school dances, etc. 


k. Out-of-School Activities 


This category summarizes all the activities which might 
reveal participation and interest in social life outside of the 
school situation. When considered in relation to the category 
school activities, it may reveal whether the student is gen- 
erally sociable and enjoys all types of social situations, is 
generally a-sociable, or sociable in school situations but not 
in out-of-school situations or vice versa. 


l. Solitary


This category is composed of activities in which one usu- 
ally engages alone, such as keeping a diary, playing solitaire, 
etc. It also lists some activities which are usually sociable 
but are designated as solitary, such as: eating alone, going 
swimming, skating, bike-riding alone, etc. 


m. Impressing Others 


This category is composed of activities which involve pre-
occupation with personal appearance, desire to be unique,
outstanding, in the limelight. The following items are repre- 
sentative: making my handwriting unusual and decorative; 
having the reputation of being different or unusual; starting 
a fashion or a fad. 


n. Leadership 
Activities which involve organizing others into groups, di- 
recting groups, debating, arguing, etc., are sampled in this 
category. Examples of these activities are: organizing com-
mittees to plan various school affairs; being in public speak-
ing or debating contests; etc.


o. Reactions to Authority


Activities listed in this category involve either submission 
to or rebellion against authority. Statements are so coded 
that a high score in likes is indicative of a submissive attitude 
toward authority, whereas a high score in dislikes is indica- 
tive of a rebellious or antagonistic attitude. Typical items 
are: writing papers on definite, assigned topics rather than 
having a free choice; being in a group where one person 
takes the responsibility and decides what people should or 
should not do. 


3. “Fantasy life”
p. Birth—Life—Death 


Activities in this category involve wondering about the 
meaning of life and death, thoughts about the origin and end 
of things, the meaning of eternity, and the stability and 
permanence of the universe. Preoccupation here might indi- 
cate the need to externalize personal anxieties and put them 
on a cosmic scale. Conflicts one cannot face near at hand are 
often projected into the cosmos, and dealt with in a philo-
sophical way. Examples of items are: finding out how things
got started; thinking about what might be the end of the
world; imagining what would happen if gravity ceased to
exist.


q. Fantasy 
Although it is important to recognize that fantasy can play 
ment mechanism and therefore in itself is 


not an indication of maladjustment, it is also true that indi- 
viduals who have difficulties in co 


this mechanism as a substitute fo
with someone I like or admire; imagining how it would feel
to be rich and famous. 


r. Mystery 


A child who has been unduly sheltered and kept away 
from the realities of the life around him may develop not 
only distorted notions regarding his environment, but also 
great curiosity and preoccupation with “the secrets of adults”
and other mysteries. The items in this category attempt to
sample the different “mystery-interests” of children and
adolescents. Such statements as the following are found in 
the questionnaire: having people “forget themselves” and 
talk freely; listening to other people’s phone conversations. 


s. Magic 


Every child, at least in part because of his relative incom- 
petence as compared with adults, in his efforts to deal with 
his environment tends to resort to magical means, such as 
good luck charms, avoidance of symbols of bad luck, etc. 
Great dependence upon these symbols may reveal a feeling 
of incompetence and a need to resort to “superior powers” 
for help. This category lists some of the activities which in- 
volve using magic, such as: carrying a good luck charm; 
making up little games or schemes which will bring luck if 
they come out right; seeing if a hoped for thing comes true 
if I concentrate on it. 


t. Dramatics 


This category is composed of theater arts activities—those 
involving writing and production of plays, and those in- 
volving taking specific roles. It can be interpreted both as 
revealing interest or lack of interest in the theater arts per 
se, and it can also be interpreted as revealing the wish or 
fantasy life of the individual. In this connection an examina- 
tion of the types of roles which are preferred is particularly 
interesting. Examples of items are: thinking up plots for
plays; taking the part of a wicked or dangerous person in a 
play. 

u. Humor 


This category is composed of activities which have to do 
with the appreciation or expression of humor. Humor may 
be thought of as a way of relieving tension. It also is fre- 
quently an accepted, subtle way of expressing hostility. This 
is particularly clear in playing practical jokes and other such 
forms of humor. The items in this category also serve to make 
the whole questionnaire lighter in tone and more entertain- 
ing. Examples are: drawing cartoons; seeing plays which are 
“take-offs” on dignified people or institutions; reading or 
writing funny poems or limericks. 


INTERPRETATION OF THE RESPONSES TO THE QUESTIONNAIRES 


The questionnaires are scored in terms of the per cent of 
the items in each category to which the student responds 
with like, and the per cent to which he responds with dislike. 
The per cent of indifferent responses may be readily calcu- 
lated by subtracting the sum of the above two scores from 
100. As will be seen presently, the interpretations may be 
made on two levels. For a quick overview of the student’s 
interests and adjustive trends, one may examine his tabulated 
per cent scores in the various categories on the Summary 
Sheet. This takes little time and gives a fair but rather gen- 
eral picture. A much more detailed study of the student may 
be made from the examination of his specific responses to
individual items in the questionnaires.
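
The scoring rule just described can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `category_scores`, the single-letter response codes (`"L"`, `"I"`, `"D"`), and the rounding to whole per cents are hypothetical conveniences, not taken from the scoring directions of the questionnaires themselves.

```python
from collections import Counter


def category_scores(responses):
    """Return per cent like, dislike, and indifferent for one category.

    `responses` holds one code per item in the category:
    "L" (like), "I" (indifferent), or "D" (dislike).
    The codes and the rounding are assumptions made for this sketch.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("a category must contain at least one item")
    counts = Counter(responses)
    like = round(100 * counts["L"] / n)
    dislike = round(100 * counts["D"] / n)
    # As in the text: indifferent is obtained by subtracting the sum
    # of the like and dislike scores from 100.
    indifferent = 100 - (like + dislike)
    return {"like": like, "dislike": dislike, "indifferent": indifferent}


# A ten-item category: six likes, one dislike, three indifferent.
example = category_scores(["L"] * 6 + ["D"] + ["I"] * 3)
print(example)
```

Computing the indifferent score by subtraction, rather than by counting, keeps the three figures summing to exactly 100 after rounding.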


Interpretation of the Scores on the Summary Sheet 


A student’s score on a category acquires meaning in two
ways: when viewed in reference to the group median, and
when viewed in relation to the student’s scores on other
categories. 

No scores are considered high or low per se. The student's 
scores are always examined in the light of the scores of other 
students in the group in which he is living and working. It 
is possible, however, to single out categories in which he 
ranks high or low in his group in likes or dislikes. From exam- 
ining these categories it is possible to draw certain inferences 
about the student. For instance, it frequently happens that 
a student has low likes on the academic interest question- 
naire (8.2a), but has high likes on all the sociable categories 
in the non-academic interest questionnaires, or vice versa. A 
student's likes and dislikes may group themselves not only 
in this broad manner but may also group themselves in 
greater specificity. One may find, for instance, a student who 
is high in likes in categories involving precision in work, such 
as physical science, mathematics, industrial arts, and method- 
ical, whereas he may be low in likes in categories involving 
greater freedom of action and self-expression, such as fine 
arts and dramatics. Again a student might be low in liking
such sociable activities as are listed in the categories same
sex, opposite sex, sociable activities in school, and sociable
activities out of school, and at the same time be high in lik-
ing fantasy, mystery, magic, etc. Many different configura-
tions are thus possible.
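
Placing a student's score against his group, as described above, might be sketched as follows. The function name `standing_in_group` and the convention of reporting tied ranks as a range (as in "3-4") are assumptions made for illustration; the sketch also assumes that the list of group scores includes the student's own score.

```python
from statistics import median


def standing_in_group(student_score, group_scores):
    """Locate one per cent score within the scores of the whole group.

    `group_scores` is assumed to include the student's own score.
    Rank 1 is the highest score; tied scores share a rank range.
    """
    higher = sum(1 for s in group_scores if s > student_score)
    ties = sum(1 for s in group_scores if s == student_score)
    group_median = median(group_scores)
    return {
        "rank_from": higher + 1,
        "rank_to": higher + ties,
        "median": group_median,
        "difference_from_median": student_score - group_median,
    }


# Hypothetical like scores of a group of ten on one category.
likes = [88, 75, 63, 63, 50, 38, 38, 25, 20, 10]
print(standing_in_group(63, likes))
```

The same function would be applied separately to the like, dislike, and indifferent scores of each category, since no score is treated as high or low in itself.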

The final picture is derived from the way in which the 
individual student reacts to a great many fields of activity: 
academic interests, sociable activities, and activities which 
indicate his attitude toward himself. In linking these seem- 
ingly quite different fields, the interpreter attempts to discover 
the common elements which make the student's response to 
academic situations understandable in terms of the way in 
which his personality is organized. The fact that the mean-
ing of a given score in a category may change with the re-
sponse of the student to other categories is an important
consideration. For instance, if a student responds to leader-
ship by liking 80 per cent of the items and also comes out
high on fantasy, but comes out low on most of the categories 
dealing with sociable activities, one is justified in raising the 
question as to whether or not the high liking of leadership 
indicates wishful thinking. Careful study of results thus far 
has indicated that if one watches for the inner consistency of 
the picture presented by a student, one learns to discover 
facts about his fantasy life and learns to single out his wish- 
ful responses. The fact that with some students the question- 
naires are apt to reflect their wishes rather than represent 
their actual behaviors is an important one and should not be 
regarded as something which makes this technique invalid. 
On the contrary, is not this gaining of insight into the inner 
mental life of the child the most difficult but important part 
of the problem? 

It frequently happens that a student's category scores are 
generally high or low in likes, indifference, or dislikes; i.e., 
one finds students who are “high likers,” “low likers,” “highly 
indifferent”—high or low, that is, in relation to the group
medians. When there is a general tendency to respond in a 
certain way, deviations from this tendency become impor- 
tant, even though the deviations may not be apparent at 
first. If, for instance, a student is below the group median in 
likes in all categories, but near the median in some cate- 
gories, and at the same time is one of the lowest in the class 
in his scores in likes on other categories, it becomes evident
that his scale has a smaller area, but that there still is a dif-
ferentiation in his response.

In each case it is necessary to examine all three scores, like, 
indifferent, and dislike. A student may have an equal like 
score on two categories, but the fact that he feels differently
about the activities in each may be evidenced by a strong
dissimilarity in his dislike scores.


Generally, the process of interpreting the summary sheet
is as follows: The interpreter first picks out the highest likes
in relation to the student’s other scores, and attempts to seek 
common elements in these categories. The same is done for 
the high dislikes and high indifferences. This examination 
includes also a consideration of the categories which the stu- 
dent likes or dislikes least. 


Interpretation of Responses to Individual Items 

Although this approach to personality study attempts to 
procure quantitative data on emotional tendencies and dis- 
positions and seems to do so rather successfully, for a deeper 
understanding of a student a more detailed analysis is neces- 
sary. This is done by an examination of his responses to indi- 
vidual items and is a procedure which is particularly impor- 
tant for gaining an understanding of the dynamics of the 
student’s behavior. Here again the same main principle of 
interpretation as is used with the category scores is applied. 
First the likes, then the dislikes, and then the indifferences 
for individual items in each category are taken and each 
time an attempt is made to single out the common elements 
which characterize or run through the given group of activi-
ties. This examination frequently reveals new categorizations
peculiar to the individual whose responses are being exam- 
ined. For instance, two students may have very similar scores 
in the total number of likes on the category opposite sex; one 
may like only the items concerned with actual sociable ac- 
tivities; the other, however, may like only those items show- 
ing a vicarious interest, those involving fantasying, reading 
romantic novels, etc., and be indifferent to or dislike the 
actual sociable activities. In the same manner, one may ob-
serve that a student may consistently like or dislike all the
items involving speaking before a group, regardless of 
whether the activity appears in a foreign language class, 
mathematics, or in a social situation. Many such individual
categorizations have been traced.

With regard to the study of specific responses within a
category, it must be borne in mind that the meaning of any 
given response to any item must be again examined in a 
twofold way. First it must be examined from the point of 
view of the particular pattern which it reveals for the stu- 
dent; i.e., from the point of view of the types of activities 
within a given category or in different categories that the 
student likes, dislikes, and is indifferent to. Second, these 
specific responses must be examined against the background 
of the responses of the same age and sex group. To make 
such a comparison possible the staff is preparing a table of 
responses of students to every item in the questionnaires. 
These tables are based on a study of responses of a large 
number of students and will show how the boys and girls of 
different school grades have distributed their likes, indiffer- 
ences, and dislikes. Thus, for the evaluation of the meaning 
of a specific response it is important to know that less than 
10 per cent of both boys and girls of all grades from seven 
to twelve mark “dislike” the item: “Talking in halls and 
locker rooms.” The significance of a student's specific re- 
sponses obviously changes depending on how the majority 
of his age mates respond to the item. 
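
The comparison of a specific response with the responses of the student's own group can be sketched as below. The function names and the 10 per cent cut-off are assumptions chosen to echo the example in the text; the staff's tables of item responses are not reproduced here, and the group data in the sketch are invented for illustration.

```python
from collections import Counter


def item_distribution(group_responses):
    """Per cent of the group marking like, indifferent, and dislike on one item."""
    n = len(group_responses)
    if n == 0:
        raise ValueError("the group must contain at least one response")
    counts = Counter(group_responses)
    return {code: 100 * counts[code] / n for code in ("L", "I", "D")}


def is_unusual(student_response, group_responses, threshold=10.0):
    """True when fewer than `threshold` per cent of the group gave this response.

    The 10 per cent default is an assumption for this sketch, not a
    rule stated in the text.
    """
    return item_distribution(group_responses)[student_response] < threshold


# Invented responses of a class of 22 to a single item.
group = ["L"] * 15 + ["I"] * 6 + ["D"]
print(is_unusual("D", group))  # a dislike shared by almost no one
print(is_unusual("L", group))  # a like shared by most of the class
```

A response flagged in this way is not in itself a finding; it only marks an item worth examining against the rest of the student's pattern.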


Discussion of Total Scores on 8.2a, b, and c*

We note that whereas the bulk of Lyle's interests on the
academic interest questionnaire (8.2a) are in the upper quar-
ter, in the non-academic questionnaires (8.2b and c) none
of his likes on any of the categories is high enough to place
him in the upper quarter of his class. On all except one
category in 8.2a he shows zero dislikes. The only category in
[...] It begins to look as if [...]


* This description was made from the material obtained from the ques-
tionnaires above [...] were made and held by them
until the comp[...]

SAMPLE ANALYSIS OF RESPONSES OF ONE STUDENT 


TABLE I 
Scores of One Student and Medians of his Class on Three Interest Questionnaires 
Lyle O., Age 12 years, 6 months 


Mid-Western Private School
7th Grade, Class of 22 boys


Category | Likes: Rank in Group | Likes: Per Cent Score | Likes: Median | Dislikes: Rank in Group | Dislikes: Per Cent Score | Dislikes: Median | Indifferent: Per Cent Score | Indifferent: Median

8.2a, Boys
Upper Quarter
Fine Arts 1 88 38 0 25
Music 1 75 14 0 58
Manipulative 2 76 50 0 20
Industrial Arts 2-5 94 75 0 H
Mathematics 3 75 20 0 38
Business 3 69 31 0 23
3 63 36 1 28 36 36
3-4 63 26 0 42
3 63 22 0 37
Reading 3-4 54 33 0 22
Physical Science 4 58 48 0 14
2nd Quarter
6 63 30 0 23
6 46 23 0 24
3rd Quarter
Home Economics | 12-14 | 19 26 0 26 81 48
Lower Quarter
5 31 50 9-11 19 17 50 33 
63 52 |12-15 0 19 37 29 
66 58 | 10-11) 14 12 
25 25 |13-15| 19 27 56 48 
49 49 17 6 22 45 29 
22 17 | 18-19 | 22 52 56 31 
38 38 14-17 6 25 56 37 
36 37 21 8 35 56 28 
24 30 15 32 38 44 32 
37 | 12-13] 2 31 40 32 
19 з1 |1920| 19 38 62 31 
22 31 18 12 24 бб 45 
30 38 | 18-19 | 13 24 57 38 
18 32 15 20 30 62 38 
26 39 18 12 16 62 45 
17 18 48 50 22 
TIERE 3|2|g | x 
HEressi 7-1 14 42 5 
Leadership © 1| a 60 19 13 58 27 
Total (Sik sie ШЗ) 28 39 21 31 51 30 
lentificati ий 
Os. | ы | dg | E 25 | 25 | sz | as 
Out-of-Sch. 8 44 0 15 92 4 
Same 0 43 10 10 90 47 
S 22 38 28 27 50 35 
5 34 19 16 38 76 28 


Figures falling in the upper quarter in the indifferent column are italicized. 
with other people and dealing with certain environmental
realities. (Incidentally, having zero dislikes for any of the 
academic activities is most unusual.) 

Whereas Lyle is above the median in indifference in prac- 
tically every category on 8.2b and c, he is above the median 
in indifference in only two categories on 8.2a—home eco- 
nomics and sports. This seems to point to some very impor- 
tant differentiations in the organization of his energies. 
Probably his indifferences in 8.2a should be examined
separately as they may be equivalent to dislikes in his case.
Evidently, for some reason which we do not know, Lyle has
a value for the “academic”—and either accepts or feels he
should accept everything which seems to fall into this classi- 
fication. 

On 8.2b and 8.2c, it is interesting to note that on his scale 
of interests fantasy is highest, whereas all of the categories 
involving interaction with other people (with the sole excep- 
tion of family) fall below the median. It would seem, again, 
that he distinguishes in some way between activities with 
other people and the things that go on in his mind. 

Lyle has only three high dislikes on 8.2b and 8.2c: aggres-
sion (b), aggression (c), and leadership. We see in this
a strong avoidance of asserting himself, openly, with other 
people. Furthermore his high indifference in the category 
authority, coupled with the very low dislike and what is, on 
his scale, a fairly high like of it, make us feel that he is a boy 
who has accepted a certain set of adult standards and avoids 
expressing any criticism or questioning of it. In a sense he 
seems to be a boy who is pretty thoroughly subjugated by 
the world of adults. It is startling to note that Lyle has zero 
likes in the category dealing with activities with the same 
sex, and has only 8 per cent likes in the category dealing
with sociable activities out-of-school. This is very unusual 
for a seventh grade boy or any boy for that matter. Actually, 
he shows on these questionnaires a slightly higher interest in 
the opposite sex than in the same sex. Usually this is re-
versed among seventh grade boys. However, his interest in
the opposite sex is not high enough so that it could be called 
an outlet for his sociable feelings. It would rather seem that 
he does not avoid it to the extent that he does the same sex. 
(An examination of Lyle’s specific reactions in this category 
reveals that he likes only three items. These items are only 
remotely connected with this category and do not involve 
any activities with the opposite sex—they deal rather with 
learning facts and with daydreaming.) 

Lyle’s low likes in the category solitary seem contradictory 
to the picture we have been getting of him. In his case, how- 
ever, we tend to think that this low score is an indication of 
a tendency in him to avoid admitting to himself (or to 
others) that he does not have a normal play-life with other 
boys and girls. If this hunch is correct then we may say that 
Lyle may, in himself, have a value for or feel a lack of satis- 
faction in the sociable area, but that the full realization of 
the fact that he misses something in life is too painful for him 
and he attempts to convince himself that he is really indif- 
ferent to it. 

Discussion of Reactions to Specific Items on 8.2a 

Since Lyle has dislikes only in the category sports, a de-
tailed examination of his responses in this category may be
fruitful. Such an examination reveals that he dislikes: to play
baseball, to play basketball, and to do setting-up exercises.
The strength of these dislikes is particularly impressive when
we recall that they are the only items which he so marked
on the whole questionnaire. We notice further that he is in- 
different to all the team games. He likes only such highly 
individualized sports as: to play horseshoes, to shoot with 
bow and arrow, to play golf, etc. 

In social studies we notice that Lyle is indifferent to all 
“social action” items, such as: taking part in a campaign 
against countries or business firms which treat people un-
justly; attending public meetings to protest against some-
thing you regard as unfair; getting people to vote for certain 
candidates, etc. On the other hand, he likes those items 
which deal with study, reading, and history. His interest in 
social studies seems to be largely an academic one. 


Discussion of Responses to Individual Items on 8.2b and 8.2c

We may take first categories on which Lyle expresses high 
dislikes (for him). 

Leadership. In this category Lyle is indifferent to almost 
all the items except that he likes to speak at a club or class 
meeting, and likes organizing a hobby club. Both of these 
are explainable in terms of his interest in academic pursuits. 
He dislikes: organizing groups to vote in a certain way in 
school elections, organizing a protest meeting in or out of
school (cf., social studies), and being captain of an athletic
team. This latter item is disliked by only two other boys in
the whole class.

Aggression. In this category Lyle dislikes such aggression 
as: throwing spit balls, throwing things when I am mad, 
playing a joke on a teacher (disliked by only three other 
boys), picking a fight when I am in the mood, and telling 
someone what I think of him. He has only five likes out of a 
total of 33 items in this category, and these likes are distin- 
guished by the fact that again they are not open expressions; 
in fact, they seem to represent what may be called fantasying 
about his aggressions. He likes: thinking of what I'll do when 
I grow up to people who have been mean to me, checking 
up on things that teachers say in order to find out if they are 
true or not, reading about real crimes and how criminals get 
caught, and thinking about how to become the cleverest, 
richest, hardest financial genius in the world. 

Authority. The striking thing here is that Lyle is very in- 
different to authority. His very indifference seems to indicate 
a certain submissiveness. We notice that he likes: having a
teacher lead and supervise a free-time activity, having a 
teacher outline in detail what should be studied and how to 
go about it, and being on a committee where the chairman 
makes the decisions instead of allowing a lot of discussion 
(he is the only boy in the group who likes this item). We 
draw from this the inference that Lyle is happiest in a 
teacher-controlled situation, and that for some reason or other 
pupil-controlled or pupil-dominated situations contain some 
sort of threat to him.

The avoidance of asserting himself in leadership and ag-
gression and his apparent liking of following adult authority
and avoidance of interaction with other youngsters, make
us think that the hostility which he must have toward his
group must be expressed through isolation from the group
rather than through open conflict, except perhaps in a very
spotty and spasmodic way. This isolation from the group is
probably expressed in his fantasy activities and also by using 
his intellectual interests as a way of achieving superiority (in 
his own mind) over other youngsters. We consider that he 
has adopted too early the adult-approved pattern, without 
having gone through the necessary stages of really arriving 
at it. This, we tend to believe, has fixated him on an emo-
tionally immature level of development. It is interesting to
note that he likes: having people take me for older than I 
am, discussing things with older people, etc. The world of 
adults seems to threaten him much less than the world of 
other youngsters. 

This interest in older people is in striking contrast to his 
seeming lack of warm, intimate, friendly interest in his own
age-group. We notice for instance, that Lyle is the only one 
in his group who dislikes: trying to find out what a quiet, 
shy person is really like, standing up against a group and 
defending a person who has been picked on, etc. Such re-
sponses make us think that he is probably essentially very
shy himself. We tend to feel that while his constellation of
academic interests may seem “mature,” there is a great de- 
pendence upon adults. Thus he seems to fear those situations 
in which he is unprotected. We notice, for instance, that he 
dislikes: talking to strangers, taking a long trip all alone, 
having my parents go off on a long trip, etc. This again seems
to point to that odd combination of adultish and infantile 
qualities in Lyle upon which we have remarked before. 

In connection with this we note that Lyle is in every in- 
stance indifferent to items which are concerned with per- 
sonal appearance. There are only two categories to which he 
is more indifferent than he is to preoccupation with cleanli- 
ness—out-of-school activities, and same sex. 

Some general comment should be made about the possible 
meaning of Lyle’s indifferences. We are inclined to interpret 
them in two ways: in part, they seem to represent a with- 
drawal of his energies from the sociable areas and throwing 
them into the academic area; in part, they may be an escape 
or protection from the reality situation. The very great in- 
difference (over 60 per cent) in such categories as same sex, 
out-of-school activities, school activities, and opposite sex is 
really very striking. We do not interpret this as meaning that 
Lyle does not have or never had any desire for social inter- 
action, but rather we interpret it as meaning that for some 
reason, and in some way, he finds such interaction difficult 
and disturbing. We tend to think that he would like to be 
able to get along with other people. He likes, for instance: 
carrying on imaginary conversations with someone whom I 
like or admire, imagining situations in which I might be a
hero, planning long adventurous journeys, etc. (In connec-
tion with this we note that he does not like the reality ver-
sions of these statements—i.e., he does not like: trying to
describe my innermost feelings to a friend, standing up for
someone who has been picked on, taking a long trip all alone,
etc.) Thus we see an important discrepancy between his
fantasy life and his attitude toward his real life. We also
notice a tendency to project a great many of these wishes or 
unsatisfied desires into the future—he likes for instance: 
planning my future family, daydreaming about the future, 
listening to fantastic plays about the future, and, on the other 
hand, imagining what I would do if I could live my life over 
again. 

In conclusion, one may say that Lyle probably does not 
get into open clashes with adults and is very likely to be 
academically a good student. His age-mates may elect him 
to class offices, but probably few of them, if any, accept him 
as a real member of the group. A number of youngsters are 
apt to be annoyed by him and make him the butt of their 
jokes. Lyle’s main difficulties seem to be that although or 
because he has accepted prematurely the standards and 
values of a certain group of adults—his own emotional 
development has been warped and arrested. 


Statements checked and written in by teachers who filled out the 
Descriptive Trait Profile 

Shy, retiring, academic minded boy. Likes science especially. Re-
treats from all social functions. Adult in thinking and associa-
tions. Brother so much older. Father and mother very brilliant.
Lyle suffers from asthma and many allergies and heart weakness.
Fear of death is strong.

Observable propelling drives? For perfection and truth in scien-
tific approach. Strong questioning mind—extremely modest—
introvert.

Vital, active, efficient, well-organized and concentrated in his at-
tack on school work.

In thinking through a problem tries within the range of his ability
to obtain a wide range of facts and considers and weighs them
impartially before arriving at a conclusion.

Outstanding interests: Science—impersonal scientific research.
Anything but people.

Thought of as being only moderately boyish in dress, activities
and interests, and physique. 


Average looking. Timid soul type. Not physically strong. Pleasant 
boy, however. 


Too secure with parents—and himself—not enough with boys his 
age—adultish in standards. 


Holds rigid standards for himself—very self-critical.
Follower—and yet respected because he knows his stuff. 


Tendency toward daydreaming, fantasy—Lyle is an introvert—
but in the scientific sense.


Ordinarily contented, satisfied, serene. Tends to make the best of 
situations even when they are unpleasant. 


Calm, composed, even, level-headed, well-balanced. Expresses 
his emotions freely and is not either uncontrolled or over- 
restrained. 


Generally flexible and adaptable; adjusts readily to new situa- 
tions, to changes in routine, etc. 

Self-confident in a calm way, estimates self fairly correctly, ac- 
cepts own assets and liabilities fairly realistically; is not over- 
modest nor has the need to brag. 


Is fairly well-poised.


Shies away from students of the same sex.
Is respected though not a prominent member of the group. His
friendship is sought and he enjoys popularity and attracts stu-
dents.


May not have any strong individual attachments, yet responds in
a moderately friendly and interested way to the opposite sex.


RELIABILITY 


The reliability of each category of scores on the two ques-
tionnaires [...] the Kuder-Richardson formula [...]
1,000 students, divided evenly among grades seven to twelve in
[...]. The results, along with the
range of scores, the mean, and the standard deviation on
likes and dislikes in each category, are given in Tables in 
Appendix VI. In general, the coefficients of reliability range 
from .53 to .86, the median coefficient for likes being .78 
and for dislikes .75. Only three categories of likes and six of 
dislikes have a reliability coefficient lower than .70. While a 
higher degree of reliability would be desirable, considering 
the intrinsic variability of behavior in this area, the reliability
of other tests in this field, and the way in which one score 
is continually checked against another, the obtained relia- 
bilities were considered sufficiently high for the purposes of 
these tests and for the manner in which they were inter- 
preted. 
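
For readers who wish to see how a coefficient of this kind is obtained, the following is a minimal sketch of one Kuder-Richardson formula, the one commonly numbered 20. Which of the Kuder-Richardson formulas the staff applied is not stated in the surviving text, so the choice here is an assumption. The sketch also assumes that each item has been reduced to 1 (marked like) or 0 (not marked like), and the name `kr20` is hypothetical.

```python
def kr20(item_matrix):
    """Kuder-Richardson formula 20 for items scored 0 or 1.

    `item_matrix` has one row per student and one column per item of
    a single category. Variance is taken over the whole group (divided
    by n), matching the p * q item variances used in the formula.
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    if k < 2:
        raise ValueError("at least two items are needed")
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores do not vary; the coefficient is undefined")
    # Sum of item variances p * q, where p is the proportion marking the item.
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Four invented students answering three perfectly consistent items.
consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
print(kr20(consistent))
```

The same computation would be repeated for the dislike responses of each category, which is why separate coefficients are reported for likes and for dislikes.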


VALIDITY 


The problem of validity of a technique of appraisal is one 
of paramount importance. It is a complex problem, however. 
On the long road at the start of which are the assumptions 
which underlie the technique and at the end of which are 
the final interpretations or descriptions of a subject, there 
are many points at which validity should be questioned and 
scrutinized. As it has been stated above, the degree of effec-
tiveness of the present method of study of personality was
checked upon at the very beginning of the study when 33
college students were described and these descriptions com-
pared with the school records of these students. Similar
informal studies have been conducted as work progressed.
These studies helped in guiding the staff in its experimenta-
tion with untried methods and suggested the abandonment
of certain ones which were not found fruitful. The following
is a presentation of some of the findings on validity to date. 


Discussion of the Evidence on the Validity of the Questionnaires

Broadly speaking, validity may be broken down into two
parts: (a) validity of the instrument as such, and (b) validity
of the interpretation of the results.

Genuineness of response. One important element involved
in the validity of any instrument of appraisal is the so-called
genuineness of the response of the subject. By genuineness
of response, in this instance, is meant the extent to which
the response represents the real feelings of the individual.
If, as may be the case in such an instrument, the response
represents wishful thinking, it is nevertheless genuine, for
the wishful thinking is an important part of the individual's
feelings. It is possible to have genuineness of response
without making valid interpretations of these responses,
although it is difficult to see how the contrary might be true.

One would naturally expect some fluctuation in category 
scores from year to year because of growth factors. If the 
response were not genuine one would expect marked and 
unpredictable fluctuations in category scores from year to 
year. One would be dealing with chance or random reac- 
tions. If, however, after having made allowance for the 
growth factor, there still is a fairly high relationship between 
the category scores one year, and the scores on a retest a year 
later, one might be justified in concluding that there is con- 
stancy, and therefore genuineness of response. The following 
table shows the results obtained when correlations were run 
between the category scores of 48 boys and 56 girls who 
responded to the questionnaires in the seventh, eighth, or 
ninth grades one year, and in the eighth, ninth, or tenth the 
next. 

These data seem to indicate that having made allowances 
for the growth factor there is still a high degree of con- 
sistency of response, and therefore of predictability. It would 
seem justifiable to assume that genuineness of response was 
a contributor to this constancy factor. 
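The product-moment correlation between first-year and retest
scores can be sketched as follows. This is only an illustration
of the standard formula; the function name and the sample scores
are hypothetical and are not data from the study.

```python
from math import sqrt


def product_moment_r(first_year, second_year):
    """Product-moment correlation between paired category scores."""
    n = len(first_year)
    if n != len(second_year) or n < 2:
        raise ValueError("need two score lists of equal length (>= 2)")

    mean_x = sum(first_year) / n
    mean_y = sum(second_year) / n

    # Sums of squares and cross-products about the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_year, second_year))
    sxx = sum((x - mean_x) ** 2 for x in first_year)
    syy = sum((y - mean_y) ** 2 for y in second_year)
    if sxx == 0 or syy == 0:
        raise ValueError("scores in one list do not vary")

    return sxy / sqrt(sxx * syy)


# Hypothetical scores of four students in one category,
# one year apart.
year_one = [1, 2, 3, 4]
year_two = [2, 4, 6, 8]
r_same_order = product_moment_r(year_one, year_two)
r_reversed = product_moment_r(year_one, list(reversed(year_two)))
```

A coefficient near 1 means that students kept nearly the same
relative standing in the category from one year to the next; a
coefficient near 0 is what chance or random reactions would give.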

TABLE 2

Product-Moment Correlations of Scores Obtained One Year Apart

Category                        48 Boys     56 Girls
[illegible]                       .78         .48
[illegible]                       .77      [illegible]
Life-Death                        .76      [illegible]
Identification with Others        .74         .70
Aggression (c)                    .72         .65
Total ([illegible])               .70         .81
Self-acceptance                   .70         .70
Total (b)                         .68      [illegible]
Humor                             .68         .74
Cleanliness                       .68         .49
Mystery                           .66         .60
Methodical                        .65         .59
Out-of-School                     .62         .57
Aggression (b)                    .61         .61
Dramatics                         .58         .69
Non-Identification                .58         .67
[illegible]                       .58         .70
Severity                          .57         .63
Family                            .85         .68
Opposite Sex                      .38         .64
Authority                         .46         .20
Same Sex                          .40         .78
Solitary                          .44         .55
School Activities                 .34         .68

In preparing the questionnaires it was felt important to
learn how students feel about this approach. In an attempt
to determine this, toward the end of the third questionnaire
was placed the item: "Answering questionnaires like this."


It may be seen from the following tabulation of responses to
this item that girls in all grades enjoy the questionnaires
more than the boys, that students in the lower grades like
them more than the older students, that in most grades more
students marked this item like than dislike and that only in
the case of the tenth grade boys did as many as 41 per cent
of them mark this item dislike.


TABLE 3

Per Cent of Students Responding Like, Indifferent, and Dislike
to “Answering Questionnaires Like This”

            Number         Per Cent of Boys    Per Cent of Girls
                              Responding          Responding
Grade    Boys   Girls       L     I     D       L     I     D
  7       78     91        42    32    26      66    26     8
  8       60     50        47    30    23      78     8    14
  9      164    177        41    30    29      57    28    15
 10       97    176        32    27    41      49    28    23
 11      114    200        42    24    34      43    23    34
 12      126     95        30    32    38      35    30    35


Discussion of the Evidence of Validity of Interpretations

1. Validation through information from the school. During
the course of the present study the questionnaires were
administered widely in a number of schools and in several of
these schools the Evaluation Staff agreed to furnish written
descriptions of some of the students’ personalities in order to
check on the correctness of the interpretations derived from
the questionnaires. The faculties in the schools selected the
students for this study before the questionnaires had been
administered. The only information on these selected students
which the staff had was the name, age, grade, and sex
of the student and the responses to the questionnaires; on
the basis of this information a rather detailed description of
the personality of each student was prepared.3 While the
written descriptions [...] questionnaires, the teachers who
[...] to read these descriptions carefully and to make marginal
notes, especially in instances when they disagreed with the
picture presented. By this method 16 case studies were made.

3 The case which was presented [...] studies. It was selected
[...] because it was shorter than most.

2. Method of appraising the extent of agreement and
disagreement with the material submitted by the schools. Since
the present approach to personality study is thought of
essentially as a technique which aims to bring out some of the
outstanding features of a personality, different patterns of
organization of energies of individuals, it was felt that the
final validation should employ methods suitable for such
material. This made it impossible to attempt to arrive at some
single index or coefficient which would represent the degree
of validity of the interpretations. It was thought further that
the problem of validation of descriptions of students derived
from the interest questionnaires involves the examination of
the cases from three angles. First, there must be an appraisal
of the comprehensiveness of the description of the students,
the extent to which the analysis brings out a number of
significant facts about the student (significant from the point
of view of the counselor and classroom teacher). Second, there
must be an appraisal of the degree of consistency or
inconsistency between the interpretation of the questionnaire
results and the material presented by the school on the same
students. Third, since the descriptions derived from the
questionnaires at times attempted to go beyond what the
classroom teacher might know about the student, a judgment
had to be made regarding the reasonableness or probability
that these inferences were valid in the light of all the
information available on the student. The same judgment had to
be made in cases when there was an actual disagreement
between the two descriptions; the teacher’s judgment could not
be accepted as necessarily infallible any more than could
that of the interpreters.

Because none of the simpler statistical methods could be
used to measure the degree to which two pictures of a
personality coincide or differ, or to determine which is
more likely to be psychologically correct, it was felt that
the opinions of a number of competent judges would offer
the best evaluation of this study. In other words, the criterion
of enlightened common sense seemed to be the most feasible
method of appraising the validity of the interpretation.

Sixteen judges were selected and they were asked to guide
themselves by the following questions in making their
judgments: (1) Would most reasonably competent people tend
to agree or disagree that the same tendency or characteristic
of the student was commented upon by the interpreters and
by the teachers, even though they may have described the
characteristic in different words and in a different context?
(2) From my experience with children and adults, from my
observations of human behavior and motivation and from all
facts presented in this case, which of the two statements
about the student seems more likely to be correct: the one
made by the interpreters or the one submitted by the school?

The judges were asked to use the following procedure in
making their evaluation of this material:

1. Read through the interpretations of the interest
   questionnaires carefully.

2. Read the comments of the teachers, marginal and otherwise,
   including the information from the Descriptive Trait
   Profile.

3. Make a statement regarding (a) the degree of
   comprehensiveness of the picture of the student, (b) the
   degree of agreement between the interpretation and the data
   [...] and finally, (c) in cases of disagreement, [...]
   regarding which of the two pictures seems [...] or valid in
   the light of all the information gathered on the student.

A list of statements was prepared for each of the three
questions (a, b, and c) on which judgment was sought. The
judges were instructed to check the appropriate statements
but to regard these statements as merely suggestive and
feel free to make their own statements. The tabulation of
statements checked or written in by the judges will be found
on the following pages for all 16 cases. Since each case was
judged by four judges the total number of judgments for
each of the three questions should normally be 16 × 4 = 64.
Because some judges checked more than one statement, the
actual number of statements is often above 64.
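The bookkeeping described above can be sketched as a simple
tally. The record layout (case, judge, statement number), the
function name, and the sample records are assumptions made for
illustration; they are not the form in which the staff kept its
data.

```python
from collections import Counter

CASES = 16
JUDGES_PER_CASE = 4
# Expected number of judgments per question if every judge
# checked exactly one statement for every case.
EXPECTED_JUDGMENTS = CASES * JUDGES_PER_CASE


def tally_judgments(checks):
    """Count statements checked and distinct judgments made.

    checks: list of (case, judge, statement_number) records for
    one of the three questions.
    """
    per_statement = Counter(statement for _, _, statement in checks)
    judgments = len({(case, judge) for case, judge, _ in checks})
    return per_statement, judgments


# Hypothetical records: judge "A" checks two statements for case 1,
# so the number of statements exceeds the number of judgments.
records = [(1, "A", 1), (1, "A", 2), (1, "B", 2)]
per_statement, judgments = tally_judgments(records)
statements_checked = sum(per_statement.values())
```

Whenever a judge checks more than one statement for a case, the
count of statements rises above the count of judgments, which is
why the table totals may exceed 64.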


LIST OF JUDGES 


Peter Blos, Institute for the Study of Personality Development,
New York City.

J. F. Brown, Professor of Psychology, University of Kansas,
Lawrence, Kansas.

P. S. de Q. Cabot, Director, Cambridge-Somerville Youth Study,
Cambridge, Massachusetts.

Frank S. Freeman, Professor of Education, Cornell University,
Ithaca, New York.

Robert J. Havighurst, Professor of Education and Secretary of
the Committee on Human Development, The University of
Chicago, Chicago, Illinois.

Josephine R. Hilgard, M.D., Fellow in Psychiatry, Institute for
Juvenile Research, Chicago, Illinois.

L. L. Jarvie, Director of Guidance and Curriculum, Rochester
Athenaeum and Mechanics Institute, Rochester, New York.

Harold E. Jones, Director, Institute of Child Welfare, University
of California, Berkeley, California.

Jean W. Macfarlane, Director of Child Guidance Study, Institute
of Child Welfare, University of California, Berkeley,
California.

George J. Mohr, M.D., Clinical Staff, The Institute of
Psychoanalysis, Chicago, Illinois; Associate Professor of
Criminology, University of Illinois Medical School, Urbana,
Illinois.

Willard C. Olson, Director of Research in Child Development;
Professor of Education, University of Michigan, Ann Arbor,
Michigan.

Daniel Prescott, Professor of Education, The University of
Chicago, Chicago, Illinois.

Fritz Redl, Professor of Psychology, Wayne University, Detroit,
Michigan; Division on Child Development and Teacher
Personnel, Commission on Teacher Education, The University
of Chicago, Chicago, Illinois.

Helen Ross, [illegible] Associate, The Institute for
Psychoanalysis, Chicago, Illinois.

Verner M. Sims, Professor of Psychology, University of
[illegible].

Herbert R. Stolz, M.D., Assistant Superintendent in Charge of
Individual Guidance, Oakland Public Schools, Oakland,
California.


TABLE 4

Judgment as to the comprehensiveness of picture of student,
the usefulness of this information to the counselor or teacher.

                                                   No. of times
Statement                                            checked

 1. The description of the personality of the
    student is very clear and comprehensive; it
    should be of real value to a counselor.             15
 2. The analysis seems to have come very close
    to several of the central difficulties of the
    youngster; it should be of help to the
    counselor.                                          29
 3. Although the interest questionnaire did not
    obtain a consistent and clear-cut picture of
    the student, the study unearthed some
    important hypotheses about him.                     12
 4. The description from the interest
    questionnaires is too vague and equivocal to
    make a judgment.                                [illegible]
 5. The statements in the interpretation could
    apply to anyone; there is nothing which
    seems to apply to this youngster specifically
    and alone.                                      [illegible]
 6. Many dominant characteristics mentioned by
    the school are missed completely in the
    interpretation.                                      7


TABLE 5

Judgment as to the degree of agreement between
interpretation and data from school.

                                                   No. of times
Statement                                            checked

 1. The picture presented is highly consistent with
    the material submitted by the school.               17
 2. There is agreement on important aspects of
    personality, disagreement on the less
    important.                                      [illegible]
 3. There is general agreement between the report
    of school and the interpretation, but the
    interpretation seems to over-emphasize or
    exaggerate certain aspects.                          6
 4. There is agreement in part, but there is a lack
    of verification by the school on details.       [illegible]
 5. There is excellent agreement in some parts,
    whereas in other parts there is marked
    disagreement.                                   [illegible]
 6. The school gives a “surface” picture of
    behavior, whereas questionnaire results describe
    “central” or “underlying” behaviors. This
    makes a comparison difficult.                   [illegible]
 7. There is marked disagreement in most areas;
    only in minor points is there agreement.        [illegible]
 8. There is little agreement between the
    interpreters’ analysis of the major outline of
    personality and the version presented by the
    school.                                              8
 9. Neither of the reports gives a clear-cut
    picture; therefore, a comparison is difficult.  [illegible]
10. Insufficient data from school for making a
    judgment.                                       [illegible]
11. There seems to be no relationship between
    the interpretation and the description
    presented by the school.                        [illegible]
    Total                                           [illegible]


TABLE 6

Judgment as to which picture seems most reasonable,
or valid in the light of all the information gathered on
the student. (In cases of disagreement, or in cases in
which the interpretation goes beyond the material
presented by the school.)

                                                   No. of times
Statement                                            checked

 1. The interpretations which go beyond the
    material submitted by the school are
    psychologically very consistent with the total
    picture.                                            19
 2. The description derived from the interest
    questionnaires seems more convincing. I tend
    to accept it as being more likely to be
    psychologically correct.                            18
 3. Even though the school’s description of the
    youngster's behavior and the interpretation of
    his feelings as revealed through the
    questionnaires do not seem to coincide, it is
    very probable that each is valid at its own
    level.                                               9
 4. Analysis seems to have hit upon the central
    themes of conflict, a fact which renders it
    especially valuable for the counselor.          [illegible]
 5. The questionnaire results help to get at some
    of the causes of the picture of maladjustment
    painted by the teachers.                             1
 6. The conclusions of the analysis give
    perspective and psychological meaning to
    teachers’ statements.                                1
 7. The questionnaire results and the school
    report supplement each other, though I regard
    the questionnaire as the more valuable
    psychologically.                                     1
 8. The questionnaire interpretation is more
    penetrating than the school material. The school
    description, while helpful, has a few
    inconsistencies; and it is more of a surface
    description.                                         1
 9. The description presented by the school and
    the description derived from the interest
    questionnaires supplement each other to form a
    consistent picture of the student.                  19
10. There are too many contradictory statements
    from the school to make a judgment.                  1
11. Insufficient data from school to make a
    judgment.                                            1
12. There are too many contradictions in the
    material to make a judgment.                         2
13. The description presented by the school seems
    more convincing or plausible. I tend to accept
    it as being more likely to be psychologically
    correct.                                             9
    Total                                               84


These three tables indicate a preponderance of opinion in
favor of the inferences about students drawn from the
questionnaires. Of 194 judgments which may be classified as
favorable or unfavorable, 157 favor the questionnaires, while
37 express some criticism or indicate a preference for the
materials presented by the school. Of the latter, 31 express
only the following criticisms: many dominant characteristics
mentioned by the school are missed in the interpretation
(7), the interpretation seems to over-emphasize or
exaggerate certain aspects (6), there is little agreement between
the interpretation and the school’s version (8), and the
school’s description seems more plausible (9). Some of these
were not intended as criticisms for they were frequently
expressed by judges who preferred the version given by the
interpretation. When it is recalled that the material
presented by the school was the result of several years of close
association with and study of students, while the
interpretation was based on three short tests by investigators
who had never seen these students and knew nothing else about
them, the preponderance of critical opinion in favor of the
questionnaires is encouraging.
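The size of this preponderance can be put as a proportion. The
short sketch below uses only the two figures of 194 classified
judgments and 157 favorable ones, and takes the unfavorable count
as the remainder; the variable names are illustrative.

```python
# Figures taken from the paragraph above.
classified_judgments = 194
favorable = 157

# Judgments not favorable to the questionnaires, taken here as the
# remainder of the classified judgments.
unfavorable = classified_judgments - favorable

# Share of classified judgments favoring the questionnaires,
# expressed in per cent and rounded to one decimal place.
favorable_per_cent = round(100 * favorable / classified_judgments, 1)

print(unfavorable, favorable_per_cent)
```

On these figures roughly four judgments in five favor the
inferences drawn from the questionnaires.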


POSSIBLE USES OF THE QUESTIONNAIRES


It may be well to indicate at this point that paper and 
pencil interest questionnaires do not necessarily constitute 
the best method of studying interests. It is possible that skill- 
fully conducted interviews, direct observation, etc., may yield 
much richer, more dependable material. On the other hand, it 
may be that one of the advantages of a questionnaire is the 
fact that a mass of comparable data are secured on a large 
number of students at one time. This material can be used 
for studies of individuals or for studies of groups or for 
studies of shifts of interests occurring with age in boys and 
girls. 


Value to the Counselor 


1. It is expected that persons who work out a few of the
individual interpretations and who begin to see the intimate 
relationship between the so-called “academic” interests and 
the emotional dispositions of the individual, will begin to 
view the in-school behavior of youngsters quite differently. 

2. The questionnaires afford the opportunity to look at a 
student from a new angle—the expression of his likes and 
dislikes in a great many areas. These one examines in terms 
of the individual and in terms of how he compares with the 
other members of the group. 

3. The questionnaire results suggest a number of hypotheses
about the student and point to directions which ought to
be investigated. The questionnaires are expected to serve
the function of a time-saving device since they point out
specific areas which have to be investigated first. Such
investigations are not blind trial-and-error searches for
information, since they are based on an hypothesis and since
the area investigated is naturally connected with some aspect
of the student which is of importance to the educator.

4. On the basis of the information derived from the pic- 
ture of the interests and on the basis of the information ob- 
tained from other sources, it is expected that courses of ac- 
tion will suggest themselves. These remedial steps will be 
based on a knowledge of the student's abilities, on a knowl- 
edge of his academic interests, and on some facts regarding 
his personal and social adjustment. 

The question of the extent to which it is legitimate to 
discuss with students their scores is being asked repeatedly. 
Some teachers even feel that a description of a student de- 
rived from the questionnaires should be read to the young- 
ster. Those who have worked with the questionnaires take a 
very definite stand on this point. It is felt very strongly about 
8.2b and 8.2c that the scores should never be shown to a 
youngster, just as the youngster is never shown his Intelli- 
gence Quotient. There are two main reasons for taking this 
stand. 

In the first place, by making the students self-conscious 
about the questionnaire, by revealing to them the nature of 
the categories on which they expressed themselves, one 
would spoil the chances for administering the questionnaires 
again. The next time the answers would be apt to be much 
less spontaneous; the student would tend either to give the 
teacher what he thinks the teacher wants him to give, or 
give whatever ideas he has regarding his liking for a given 
category as such. It would be very similar to giving the stu- 
dents the key to questionnaires and asking them to respond 
to items as they are arranged under the various categories 
instead of having the statements in a random order. This 
consideration applies to 8.2a as well as to 8.2b and 8.2c. 

The second reason which makes letting the students see
their own scores seem undesirable is the injury which this
may do to them. When one constantly sees adults who take
numerical scores, medians, etc., as if they were absolute and
infallible realities, one can easily imagine the damage which
may be done to a youngster who would suddenly be
confronted by the fact that he scored way below the median of
the class in liking his family or that he came out highest in
the class in disliking it. Even if the scores were absolutely
correct representations of youngsters’ feelings, pointing them
out to the student would not alter these feelings, but would
be apt to increase the self-consciousness and, therefore, the
conflict about these feelings. There seems to be a very
common misconception in the minds of many people that the
mere pointing out of a fact to a person has therapeutic effects.
This misconception may be due to two things. In the first
place, it is true that in relatively simple matters, pointing out
a fact to a person often makes this person watch himself in
this respect or makes him actually change his behavior. For
instance, when a student consistently misspells a word or
has difficulty in constructing a sentence, pointing out his
shortcoming to him may have beneficial effects. In the area
of feelings or emotions, however, the pointing out of a
tension or conflict or the pointing out of a symptom of a tension
often tends to aggravate the situation.

In the second place, this misconception may be due to an
[...] "insight," which is frequently [...] literature. Contrary
to the [...] worker, psychologist, [...] to his client, but, when
[...] client that the latter gains [...] instead of allowing the
[...] strengthens the block [...]

[...] have some qualms about an un[...] [...]less be able to gain
certain insights which will assist him or her in manipulating the
environment of the youngster as a means of making it easier
for the student to make the necessary adjustments.

It is somewhat less dangerous to let students see their
scores on 8.2a. In certain situations this may be permissible,
much depending on the type of youngster one is dealing with
and much depending on the relationship between the
student and the interpreter of the questionnaire. One should be
always cognizant of the fact, however, that such a discussion
is almost certain to make it impossible to give the same
questionnaire again. Moreover, the student is apt to take his
score, as compared with the median of the class, as evidence of
a permanent characteristic of himself, perhaps as evidence of
an inherent lack of interest in the subject, perhaps even as
evidence of his inability to do well in this area. Trying to
correct this by telling a student: “Now just snap out of it,
John, you can be interested in this as much as anyone else!”
can hardly be expected to stimulate a real interest.

In cases of students who are really eager to learn more
about themselves and their performances on the
questionnaire, it is suggested that, without showing them their
actual scores and the median of the class, one could pick out
the highest interests of the individual, mentioning to him that
they seemed to be his highest interests and pointing the
discussion in the direction of what this student actually enjoys
doing, what he actually enjoys at school, etc. The areas of
low interests, as revealed by the scores, do not have to be
discussed with reference to the questionnaire but may come
up for discussion naturally, as the outcome of the whole
conversation. The above approach in which one starts with the
area of outgoing feelings and interests of the student is
thought to be much more positive. This positive approach
is apt to make the whole discussion a pleasant and
spontaneous one and is apt to cement the relationship between
the counselor and counselee rather than create a breach.


The Administration of the Questionnaires


Questions relative to the method of administration have 
been brought up by a number of teachers. Some seem to feel 
that the situation under which the questionnaires are admin- 
istered has a great deal to do with the results. 

It is thought best to present the questionnaires rather cas- 
ually, perhaps as part of a survey of the school or as part of 
a study of pupils’ interests. Certainly the validity of the re- 
sults is considerably reduced if one tells the students that the 
school wants to find out “everything about their personali- 
ties” or if one singles out a troublesome student and lets him 
take the questionnaires by himself or under the immediate 
supervision of some stern adult. Preferably the question- 
naires should not be given at a time when they draw the 
students from an activity which they particularly enjoy. 
Their resentment will probably reflect itself in their re- 
sponses. The traditional “test” situation should be avoided as 
much as possible and every effort should be made to make 
it a pleasurable experience. 

The fact that most of the items in the questionnaires were
furnished by youngsters indicates that frank statements can
be obtained from them. The fact that such responses can be
obtained only by a person in whom the children have
complete confidence, because of this person’s tact in dealing
with their feelings, must also be borne in mind.


SUMMARY 


In concluding this chapter it may be well to point out some 
of the main features of the present technique of study of 
personal and social adjustment. These features may be sum- 
marized as follows: 


1. [...] therefore they do not arouse the anxieties which many
such tests evoke. They have been found to be actually
enjoyed by a great many children. Most of the items in the
questionnaires have been obtained from children’s diary
records of their daily activities. Whenever possible,
youngsters’ language was preserved in the inventories.

2. Flexibility. The inventories do not attempt to discover 
whether the student does or does not fall into one of a group 
of patterns prearranged by the investigator. Rather they at- 
tempt to provide a field upon which, with certain limitations, 
the student may trace his own pattern or profile. The sub- 
jects are thought to reveal their various affective trends 
through the configuration and the interrelation of their re- 
sponses. 

3. Aims at a dynamic instead of a static picture. This
method attempts to reveal how a student operates or func- 
tions, what adjustive devices he employs, how he feels about 
various activities. This aspect of the method is expected to be 
of particular practical usefulness. 

4. Aims at gaining insight into students’ motivation. In- 
sofar as it is possible through the examination of specific 
responses to discern common elements in new groupings of 
likes and dislikes, one is frequently able to see what lies 
behind these feelings. This gives useful clues as to how to 
motivate the student’s interest in some other activities. 

5. Tends to make a student's academic likes and dislikes 
understandable in terms of the organization of his person- 
ality. It is felt that only too frequently there is a dichotomy 
in our concept of a personality. The thinking life of a stu- 
dent is thought of as a discrete, separate unit determined by 
his I.Q. and "special abilities" and unrelated to his needs,
drives, and goals. The approach outlined above aims to bring 
to light certain common trends in the individual which evi- 
dence themselves both through his academic interests and 
other activities. Should it be possible to give a classroom 
teacher an instrument which will enable her to relate the 
strivings and the goals of a student and the possible
satisfaction of these goals to work on certain academic problems, the
opportunity to make education meaningful to children would 
be increased greatly. 

6. Final results are descriptive rather than definitive. In- 
stead of having the final picture a score or series of scores, it 
is a brief personality sketch or study. This sketch is derived 
from the way in which the individual student reacts to a 
great many fields of activity: academic interests, sociable ac- 
tivities, and activities which indicate his attitudes toward 
himself. 

7. Questionnaire results are inferential. The present
approach should not be thought of as a “test” or as an
instrument which is meant to give conclusive evidence regarding
a student's personality. The results are inferential. The
interpretations should always be regarded as hypotheses which,
when combined with other information on the student, might
prove useful to the counselor.


Chapter VII 


INTERPRETATION AND USES OF 
EVALUATION DATA 


The preceding chapters have explained the development of 
evaluation instruments in several major areas of objectives. 
References to methods of interpretation and uses of these 
instruments were confined to single instruments or pairs of 
instruments. Other problems of interpretation and uses were 
encountered when a whole program of evaluation was de- 
veloping. The present chapter is devoted to these problems. 

Methods of interpretation and uses of evaluation data were 
determined largely by two factors. One was the conception 
of the functions which interpretation was to serve; the other 
was the character of the data and the assumptions on which 
they were based. 


Functions of Interpretation 

Since the main purpose of evaluation was to help teachers 
improve their curriculum and guidance, the first function of 
interpretation was to translate the evidence from columns of 
figures into descriptions of behavior which were intelligible 
and useful to teachers for this purpose. Such translation oc- 
curred on three levels: single scores or bits of evidence, 
whole instruments, and batteries of instruments. 

At the first level, even a single score on a test usually car- 
ried no self-evident meaning. What, for example, did a score 
of 11 per cent on crude errors in the test on interpretation of 
data mean? It seemed to be low (desirable); it was actually 
high (undesirable) as such scores went; but in a group which 
had had little training in this ability, it might be below
the median, and better than was to be expected from this
student. Thus each score had to be translated, at least in the
mind of the interpreter, in terms of the behavior which it
represented.

Each score, however, was only a part of the larger pattern 
of behavior revealed by a given instrument. At the second 
level of translation, therefore, each score had to be inter- 
preted in the light of the other scores on the same instru- 
ment, in order to see the larger tendencies in behavior in this 
area and their dependence on one another.

This process was continued with scores from a battery of 
instruments at the third level of translation. Thus, scores in- 
dicating inability to get accurate meaning from quantitative 
data, combined with evidence of general ability in logical 
discrimination and skill in quantitative techniques, might in- 
dicate that the difficulty lay only in failure to devote the 
necessary attention and persistence. 

This level of translation made possible the second func- 
tion of interpretation: 
possible causes of the s 
and groups. To locate 
sider not only all 
also the history o 
relevant factors in 
entirely possible 
when teachers had 

Finally, it was 
hypotheses regard 
situation. This was a ste 
human behavior and the methods by which that behavior
could be controlled and changed. 


The Nature of the Data and the Assumptions Underlying Them 

The process of evaluation was composed of two elements 
which on the surface seemed contradictory, and which
traditionally had been held to be contradictory. In the first place,
any form of appraisal is essentially an analytic process. To 
see each individual clearly and accurately and to observe the 
differences among individuals more precisely, it was neces- 
sary to break up larger complexes of behavior into their com- 
ponent parts and to get as accurate measures of each as 
possible. 

Thus, in the course of the Eight-Year Study, reference was 
often made to “breaking up” objectives. Separate instru- 
ments were constructed to appraise each area of objectives, 
and in many cases each aspect of specific objectives. This 
type of approach could easily be identified with “atomism,” 
that is, with an assumption that human behavior is composed 
of isolated reactions, each of which can be understood,
explained and appraised as a separate entity.

However, evaluation in the Eight-Year Study has also ad- 
hered to the second, synthesizing function of appraisal. One 
of the most influential psychological principles guiding the 
work has been the assumption that the essential character- 
istic of human behavior is its organic unity, and that various 
aspects of it function in close relationship with each other. 
It was clear that no single aspect of human behavior would 
be understood without reference to the total pattern of
behavior. Similarly, it was clear that usually no single type of
growth could be fully achieved without some progress in all 
others. While an uneven development was expected toward 
certain objectives, such as thinking, attitudes, interests, social
adjustment, and so on, no one aspect should be developed too 
far without some growth in other important aspects of
development taking place at the same time. Thus, if logical
thinking were cultivated without much attention to emo- 
tional and social maturation, not only would the development 
of thinking be handicapped; personality maladjustments 
might also appear as a result of too uneven a rhythm of 
growth. Similarly, the possibility of rational and objective 
social attitudes was greatly limited unless a certain degree of 
maturation took place in social interests. 

This basic assumption found expression at several points 
in the development of the evaluation program. One of these 
was the conception underlying the comprehensive set of ob- 
jectives. The areas of objectives described in the first chapter 
were not chosen arbitrarily or accidentally. In formulating 
objectives and in classifying them, an effort was made to 
include such a range of the significant aspects of human 
growth that, taken together as goals of development, the 
areas of objectives would represent a unified and related
development of the whole person. Thus the term
“comprehensive” used in conjunction with objectives referred
primarily to the range of aspects of human growth viewed as
an organic unit.


The idea of related 


the structure of the instruments developed as well as in plan- 


ments. Thus, each instrument at- 
ern of closely related behavior 
behaviors. For example, in de- 
the ability to apply social values 
n analysis was made of the be- 
qualities were shown at the same time. Further, the question
of the nature of their values entered. A broad and compre- 
hensive awareness of values and their implications might 
involve a consistent or inconsistent, homogeneous or 
ambivalent pattern of those values. This pattern might be 
what is commonly called “democratic,” or “undemocratic.” 
Recognizing the relationship of these three types of reactions, 
namely comprehensiveness, logic, and values, it was neces- 
sary to construct a test permitting the diagnosis of each of 
these behaviors in a context involving the others. The test 
provided for each type of reaction and permitted a descrip- 
tion of them in their relationship to each other. 

While each instrument was constructed to appraise specific 
behavior related to specific objectives, the relationship of 
these behaviors to the total behavior pattern of an individual 
was not forgotten. In many cases instruments were frankly 
devised as “mates” to each other, because it was clear that 
the behaviors measured by them were strongly influenced by 
each other, or because it was recognized that certain kinds 
of behavior needed to be checked in different content. Thus 
the instruments measuring general social beliefs were supple- 
mented with others appraising the application of these be- 
liefs in concrete situations and the logical thinking involved 
in such a process. The evaluation of free reading was con- 
ducted hand in hand with the evaluation of responses made 
to that reading. Information and application of information 
were found to be importantly related and some instruments 
appraised both with reference to the same content. Similarly, 
recognition of the strong relationship between interests and 
thinking made it necessary to secure evidence on interests in 
all areas in which logical thinking was appraised, so as to be 
able to diagnose weaknesses in thinking in relation to in- 
terests in the same areas. 

Often an effort was made to secure supplementary evi- 
dence from a series of instruments on certain characteristics 
appraised directly in one instrument. Thus the tendency to
go beyond data or to be overcautious was directly measured
in the test on interpretation of data. Supplementary evidence 
on the same tendencies could be gained from other tests also. 
For this reason some scores were retained even though their 
statistical reliability as separate scores was low, for the reli- 
ability of the conclusions increased as the same tendency 
was shown in many different instruments. 

Thus, in a sense, the series of major instruments composed 
a related battery. Each instrument was a part of a compre- 
hensive plan for evaluation, designed to correspond to re- 
lated behaviors within a unified pattern of development. 
Thus the synthesizing function of evaluation was expressed 
in the structure of the instruments as well as in the relation- 
ship of the instruments to one another. 

As a result, what the interpreter found was not a series of 
isolated data, but a series of data which fitted into a pattern 
of behavior relationship. His job was facilitated because the 
required synthesis was not to be brought about from a
planless series of isolated bits of evidence. Certain generalized
relationships were inherent in the very nature of the data. 
His task was to detect the variations of individual and group 
patterns within this general framework.


Illustrative Case Study

To illustrate the problems encount 
reasoning and inference fruitful in sy 
data, a case study is presented on р. 4 
studies, a difficulty observed by some teacher, a behavior
problem requiring explanation, or some inconsistency ob- 
served in the data themselves. The nature of the problem 
usually determines at which point the analysis of informa- 
tion begins and what sequence the consideration assumes. 

The case of Jane came to the attention of counselors and 
teachers when they surveyed the data from a battery of in- 
struments prepared by the Evaluation Staff and found that 
the impressions of Jane secured from these data differed 
from the ones prevailing among the school staff. For this 
reason the investigation proceeded first to locate some of 
the outstanding conflicting impressions and then to examine 
data relevant to explaining them. However, the data are 
here presented not in the order in which they were secured 
or analyzed in the school, but in the order of their explana- 
tory value for the subsequent data. 


Background Data 

Jane is a senior in a large public high school and has come 
to it through a junior high school on the same campus. Sev- 
eral teachers have thus known her for some time. She is 
considered an average, normal child, so much so that,
according to the counselor, she has scarcely been noticed. She
has never created any trouble, has done her work fairly well
and, except for occasional difficulty with her Latin teacher,
has behaved as a “good” student. Her I.Q. is 120 (Terman
Group) which is in the middle of the range of her group.


Standardized Achievement Test Scores 
Her percentile scores on standardized achievement tests 
taken over the preceding two years were as follows: 


Year I                          Year II

Algebra ............  55        English Usage ............ 87
English ............  84        Spelling .................
French .............  85        Vocabulary ...............
Latin .............. 100        Literary Comprehension ... 92
Medieval History ...  99        Reading Rate ............. 85
                                Literary Acquaintance .... 98




Apparently Jane has a high level of achievement in the 
usual subject matter skills and information. With the
exception of algebra and spelling, her scores are at or above the
84th percentile.

Two questions suggest themselves at this point. First, one 
notices that her scores on mathematics and spelling are con- 
spicuously lower than the others and one wonders what may 
be the cause for that. Secondly, one is curious about how 
Jane’s standing in the class on achievement scores compares 
with her abilities as measured by intelligence test scores. 

Examination of the range of scores for the group revealed 
that Jane tends to stand higher on achievement tests than 
on the intelligence test scores. One notices also that the areas 
of her high achievement are areas of high verbal content 
which suggests a special proficiency with words and possibly 
difficulties with areas and processes requiring the use of 
other techniques and symbols. 

Teacher Reports 


A look at the teachers’ reports to her parents reveals the
following:


Algebra—Teacher has little to say, except that Jane has diffi- 
culty with learning mathematics, especially when it comes 
to application of quantitative concepts to practical problems. 


English—In general, Jane understands what she reads. Some
of the modern poetry presents difficulty. She needs to
increase her speed of reading. As far as free reading is
concerned, she shows "appreciation, acquaintance, and scope in
her reading." Her literary background is satisfactory,
especially with reference to literary criticism. When in a hurry,
she makes unreasonable mistakes in spelling. “Jane knows
better.” Organization of materials is excellent and
presentation acceptable. Excellent work habits.


French—Reads with comprehension, speed, and accuracy.
Has good memory for words. Understands and remembers
grammatical principles. Reads smoothly and knows rules of 
pronunciation. Responds orally in fluent speech. Written 
work could show improvement in application. Is much inter- 
ested in foreign people and their contribution to civilization. 
Does individual research work in music for her own pleas- 
ure. Work habits excellent, though lack of preparation was 
evident in the last two tests during the two weeks preceding 
the report. Has intellectual interests in Romance languages 
and their development. Is studying Spanish in her leisure 
time and corresponds with a foreigner in that language. 


Latin—Has keen power to get thought from foreign lan- 
guage without translation. Vocabulary is very good; gram- 
mar and pronunciation good. In applying fundamentals, 
written work is better than oral work. For the past six weeks 
has made no effort to do more in silent reading than the 
minimum requirement. Is unique on occasion in applying 
historic-cultural materials, but frequently fails to come 
through. Work habits are bad. Does not pretend to do things 
on time. Intellectual interests sometimes very high,
sometimes very low.


Social Studies—Good mastery of such skills as reading, map
work, use of graphs and charts, library books. Knows a
satisfactory number of historical facts. Reads more than average,
though mostly nature books. Work habits are steady and
persistent. Has intellectual interests in cultures different from 
her own. 


A few things stand out in these reports. First, with the
exception of reports from the teacher of Latin, teachers’
reports are consistent with the results of the standardized tests.
The mathematics teacher reports difficulty with algebra, and 
the English teacher comments on Jane's "unreasonable" 
spelling. One wonders whether the teachers’ reports were 
based on or influenced by the achievement tests, but the
reports were written before the tests were given. The fact
that the Latin teacher reports difficulty with Jane’s work 
habits, while her achievement score in Latin is very high, 
suggests several possibilities. First, the Latin classes may
emphasize objectives not measured by the achievement test. 
The Latin teacher may have been unduly influenced by 
Jane's slump during the last six weeks, and may be apply- 
ing pressure to get her out of it. Jane may also have had 
some special difficulty with the teacher which may have
influenced the teacher’s observations. Finally, Jane's
proficiency with words may have caused her to be bored by the
class work, which she mastered all too easily. Each one of 
these points can be checked easily enough in the school 
situation. According to the counselor, the Latin teacher was
the only one who insisted that Jane develop a modicum of
precision and care with


satisfied with more gene 


, and occasionally English, place her 
trait than do the teachers of mathe- 
articularly the latter, Thus, in assessing 
highest as well as the lowest ratings. This suggests several
possibilities. First, the teachers may have had insufficient 
opportunity to observe Jane on all characteristics, and
therefore may have given somewhat invalid reports. The teachers
may also have rated Jane according to her achievement in 
the class, thus being influenced by what is called a “halo 
effect.” It may also be that Jane’s difficulties in academic 
achievement influenced her personal relations with each of 
the teachers concerned and hence affected her actual be- 
havior in class. 


Summary of Counselors’ Interviews over Two Years 

Due to the loss of her parents, Jane lives some distance in 
the country with her grandmother and aunt. She has con- 
sequently had little companionship with other children and 
is thrown a great deal with older people. Moreover, the 
grandmother and the aunt do not get along well, and Jane 
feels that she often has to take the brunt of their differences 
with each other. Jane feels that her ideas are “foreign” to 
those of her grandmother and aunt, and she suppresses them 
at home, “for the sake of peace.” When the difficulty with 
her work habits in Latin was pointed out to her, Jane said 
she was in the habit of leaving work to the last minute and
rushing through with it, a habit indulged in by many “bright
students.” Since she got good grades, “why bother?” As to
her difficulty with Latin, she felt that she could get more
out of the language by herself.

Concerning her personal life, Jane confesses that she
cannot work with other people, because of her unwillingness to
accept suggestions. She also talked about having temper tan- 
trums and throwing things around in her room. These tan- 
trums were referred to in both interviews, a year apart. She 
has only a few friends. One of them, a Jewish girl, whom 
she admired very much, she was forced to desert on the
insistence of her other friends.

Her vocational plans are undecided. In the tenth grade
she expressed interest in history and archeology, and the 
next year in languages. She wants to go to Stanford Univer- 
sity, however, because “the climate suits her health and the 
architecture her temperament.” This is contrary to the wishes 
of her family, who want her to enter Bryn Mawr. She has 
had no vocational experiences. 

Summer vacation activities include a trip to Mexico (sub- 
sequent interest in Spanish), summer high school work in 
Spanish, and the study of Italian by herself. 

Recreational and club activities are limited in number and 
are mostly solitary in nature. Orchestra is the only club ac- 
tivity in school, which is less than average for high school 
students. Athletic experiences include riding, swimming, 
cycling, and walking. She hates and fears “gym.” She listens 
to the radio, reads, and attends a few movies, and confesses 
she does not know how to play. She reports that her health 
is good. 

This record reveals several adjustment problems and their 
probable sources. There is a tendency to withdrawal and a 
certain degree of difficulty in adjusting to other people, both 
adults and those of her own age. These difficulties
apparently have not been noticed by the classroom teachers. Her
choices of free activities, which do not include many usually
chosen by girls of her age, concentrate exclusively on soli- 
tary activities. She has few friends, and her relations with 
them are somewhat complicated. Immaturity is shown in her 
vocational plans and experiences. Her reasons for choosing 
a college seem far-fetched and affected. Part of the sources 
of her difficulties lie in her home life. At least the fact that
she lives out of town, in a household composed of elderly
people, may be sufficient cause for her lack of contact with
people her own age, and hence her current adjustment
difficulties.


Summary of the P.E.A. Test Data


INTEREST INDEX, TEST 8.2a²


                            Jane's            Class Median
Category                Likes  Dislikes     Likes  Dislikes

Social Studies .......    38       0          51      13
Biology ..............    19       0          56      13
Physical Science .....    25       0          56      13
English ..............    75       0          63      13
Foreign Languages ....   100       0          63       6
Mathematics ..........     0      25          43      25
Business .............     0       0          56       6
Home Economics .......    13       6          44      19
Industrial Arts ......    31       0          44      18
Fine Arts ............    88       0          38      12
Music ................    76       6          56      12
Sports ...............    12      38          56      18
[illegible] ..........    37       3          44      21
[illegible] ..........    54       0          58      14
Total ................    39       6          52      21


In the twelfth grade as well as during two previous years, 
Jane’s interest pattern is highly selective. Strong preferences 
are shown in four areas: English, fine arts, foreign languages, 
and music—foreign languages being the highest. These 
choices reveal two types of basic preferences: verbal activ- 
ities and creative activities. The areas having to do with 
life realities, practical activities, and precise thinking are 
conspicuously lacking in her pattern of likes. The general 
tone of her interests in areas other than the ones mentioned 
above is that of indifference. Thus, the activities classified 
as biology, physical sciences, home arts, business and sports 
are a matter of indifference to her. Her total “dislikes” com- 
prise only 6 per cent of all of the items. 
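
The comparison made in this paragraph can be sketched in a few lines of Python. This is only an illustration, not part of the original evaluation procedure: the function name and the 10-point margin are assumptions introduced here, the figures are the "likes" percentages from the Interest Index table, and the Music row is read from a partly illegible label on the strength of the surrounding discussion.

```python
# "Likes" percentages from the Interest Index table: (Jane, class median).
# The "Music" label is an assumption; the printed row label is illegible.
LIKES = {
    "Social Studies": (38, 51),
    "Biology": (19, 56),
    "Physical Science": (25, 56),
    "English": (75, 63),
    "Foreign Languages": (100, 63),
    "Mathematics": (0, 43),
    "Business": (0, 56),
    "Home Economics": (13, 44),
    "Industrial Arts": (31, 44),
    "Fine Arts": (88, 38),
    "Music": (76, 56),
}


def strong_preferences(likes, margin=10):
    """Return areas where Jane's likes exceed the class median by `margin` points.

    The margin is a hypothetical cut-off chosen for illustration only.
    """
    return [area for area, (jane, median) in likes.items()
            if jane - median >= margin]


print(strong_preferences(LIKES))
```

Under these assumptions the sketch picks out the same four areas named in the text: English, foreign languages, fine arts, and music. A different margin would move the boundary, which is one reason such a list is best read as a hypothesis rather than a finding.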


2 For a detailed description of this test and of the meaning of the summary
categories see p. 338, Chap. V.

In the area of sports, however, she shows marked negative
responses. Her dislikes here are in the highest quarter in the 
class. This is significant, because Jane has few dislike re- 
sponses. Her remark to the counselor about her fear of gym 
corroborates this evidence but offers no explanation. Con- 
sidering the fact that her choice of free recreational sports 
activities is limited to solitary activities, and also the fact 
that there is no evidence of a physical handicap or lack of 
physical skill, one is inclined to suggest that her negative 
reaction to sports occurs at the points of group or team ac- 
tivities. There is also other evidence suggesting that she dis- 
likes and avoids activities involving social or competitive 
contact. Thus, on a previous questionnaire she showed very 
high dislikes on items concerning leadership and sociable
activities. One is also reminded of her remarks to the counselor
to the effect that she could not work or play with other
people.

From these facts one develops an hypothesis of a solitary
girl with a rather concentrated and somewhat narrow range
of interests, which deviate in many aspects from the average 
pattern for girls of her age. An interesting inconsistency is 
apparent in one spot. Her score on interest in art is high. 
Yet her activity record shows no participation in art activ- 
ities. Her lack of participation in art activities in the school 
might be due to the fact that her school schedule did not 
permit it, but she chose a second foreign language rather 
than art as an elective, and a study period rather than an
art club. Neither is there any hint of artistic expression 
On another questionnaire she 
except in instrumental music: 
Ponses to free reading do not 
late impressions gained from 
: It may be that her “art” interest 
this interest is "spurious" in the 
sense of being a symbolic expression of some other difficulty
or problem. 


Free Reading and Cultural Activities 

Another series of data on her interests and preferences 
comes from her free choice activities and reading record. 
She reads the local daily paper and occasionally the New 
York Times. This latter, she says, is her favorite paper be- 
cause of the book, art, and music notes. She is, however, 
unaware of the political theory favored by the papers she 
reads. (This is rather common, though, among high school 
students.) She spends an average amount of time (four 
hours per week) reading newspapers. Interesting, however, 
are the items she remembers from her reading during one 
month. These deal mostly with music (death of Chaliapin, 
opening of Robin Hood Dell) and international news (quake
in Mexico, Hungarian countess married, taking over of Amer- 
ican oil interests by Mexican government, Señora Cardenas 
and her friends giving their jewels to help United States oil 
interests, former Ethiopian ruler paying back dues to League 
of Nations). There are no items of national importance 
among the list of items she remembers, nor does she pay 
any attention to the editorials. 

Her free reading during one sample period of a month 
(May 6 to June 6) included the following books: Wilder, 
Bridge of San Luis Rey; Wallace, Fair God; Lewis, Charles 
of Europe; Sabatini, Stalking Horse; Ellis, The Soul of
Spain. These are books about countries other than the United 
States, or by foreign authors. Her reading over a period of 
a year is twice as voluminous as the average for the class. 

Her magazine reading is rather average in quantity and
character. Thus the Ladies’ Home Journal, Saturday Evening
Post, Time, and Woman's Home Companion are read
regularly, mainly because they are received at home. The 
only deviation from the usual pattern is the reading of the 
National Geographic regularly and in full, and the omission
of the Reader's Digest. National Geographic was subscribed
for at her request. At no time has she made use of the period- 
icals in the school library. 

She attends no concerts, which is surprising in view of her 
apparent interest in music, of her proximity to a major or- 
chestra, and the tradition in the region of attending concerts. 
She has attended no plays. She spends a lot of time, though, 
listening to music over the radio, her favorite programs being 
Charlie McCarthy, Ford Sunday Evening Hour, RCA Magic 
Key, Radio City Music Hall, and La Rosa. Archery is her 
only other recreational activity. 

All of this is rather consistent with what was suggested 
by previously given facts about her interests and personality 
pattern. The impression of her preoccupation with the
faraway and the esoteric is reinforced by her reading
selections. Her failure to face the “here and now” is again
emphasized.


APPRECIATION OF LITERATURE, TEST 3.32


Category                  Jane's Scores   Class Median

Likes Reading ..........       100             62
Wants More .............        60             75
[illegible] ............       100             55
[illegible] ............        35             25
[illegible] ............        50             60
[illegible] ............       100             70
[illegible] ............       100
[illegible] ............        84             65
Non-appreciation .......        15             40
[illegible] ............         1              1


3 For a detailed description of this test and the meaning of the summary
categories, see p. 255, Chap. IV.

With the information about her reading interests at hand,
it is interesting to look into her responses to free reading.
The results from this test conflict at several points with the 
impressions from data up to this point. She apparently likes 
reading very much and is also curious about the background 
of authors and of the settings of literary works. This is con- 
sistent enough with her voluminous reading. She shows a 
much higher than usual tendency to relate what she reads 
to life and to evaluate reading, which is surprising in view 
of her apparent lack of interest in matters concerning life 
realities. As will be seen later, however, she shows little 
ability to discriminate between what is true to life and what 
is not. It has already been noted that while she has indicated
high interest in the arts, she does not show any strong
inclination to translate her impressions from reading into other art
forms. People who are withdrawn and rely much on reading
to secure experience with life are usually inclined to
respond to reading with a high degree of self-identification.
This is not the case with Jane. Her score on identifying
herself with what she reads is below the median of the group
and also below the usual scores in the same grade. This may,
however, be a mark of sophistication in reading.


CRITICAL-MINDEDNESS IN THE READING OF FICTION, TEST 3.7⁴


                   Judicious  Hypercritical  Uncritical  Uncertain
                       %            %            %           %

Jane's Scores ...      40           36           33          25
Class Median ....      70           18           22           5


According to the results from this test, Jane is not very
successful in distinguishing realistic life situations from the
dramatic or melodramatic ones. Her recognition of lifelike
situations (judicious decisions) is the lowest in her group.
She also has a strong tendency to be hypercritical: to judge
situations and behaviors which are usually considered
true-to-life as the opposite. These data support the impression of
her lack of experience with life realities, and her difficulties
in dealing with them. At many points she finds it impossible
to make up her mind. This test is not good enough to be
conclusive on this point, but it gives rise to some doubt about
her literary judgment, in spite of her voluminous reading and
her high score on disposition to evaluate reading.

4 For the description of this test, see p. 266, Chap. IV.


INTERPRETATION OF DATA, TEST 2.51⁵


Category                              Jane's Scores   Class Median

General Accuracy ..................        54              57
Accuracy with Probably True and
  Probably False ..................        35              38
Accuracy with Insufficient Data ...        51              58
Accuracy with True and False ......        76              75
Overcaution .......................        18              21
Going Beyond Data .................        43              36
Crude Errors ......................        11               8


In techniques of getting meaning from quantitative data
requiring precise thinking, Jane is near the average for her
class. Her scores on accuracy are slightly below the median.
This indicates inability to recognize the limitations of data.
An examination of types of errors shows a greater than
average tendency to go beyond the data, or accept generalities
ignoring the limitations of the data. Not only is this score
among the highest in the class (significant, since most of
her scores are close to the median), but the proportion of
errors in this direction in comparison to those in the
direction of overcaution is also larger than that of the class
(Beyond Data : Overcaution, 43:18; Class, 36:21). Her
score on crude errors is one of the highest in the class.

5 For a detailed description of these summary categories, see
Chap. II.

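
The proportion of errors cited above (43:18 for Jane against 36:21 for the class) can be worked out explicitly. The following is a minimal sketch for illustration, not the Evaluation Staff's own procedure; the function name is an assumption introduced here.

```python
from fractions import Fraction


def beyond_to_overcaution(beyond_data, overcaution):
    """Ratio of 'going beyond data' errors to 'overcaution' errors."""
    return Fraction(beyond_data, overcaution)


jane = beyond_to_overcaution(43, 18)          # Jane's scores
class_median = beyond_to_overcaution(36, 21)  # class medians

print(f"Jane:  {float(jane):.2f} beyond-data errors per overcaution error")
print(f"Class: {float(class_median):.2f} beyond-data errors per overcaution error")
print("Jane leans further toward overgeneralizing:", jane > class_median)
```

On these figures Jane makes roughly 2.4 errors of going beyond the data for every error of overcaution, against roughly 1.7 for the class, which is the sense in which her errors run more strongly in the direction of overgeneralization.
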
In view of her fairly high accuracy in determining the
absolute truth or falseness of inferences, her inaccuracies in 
judging trends and probabilities may have been the result 
of somewhat careless reading, particularly in view of the pre- 
vious hints of difficulty with details requiring precise work 
and application, such as low scores on mathematics and dif- 
ficulties in areas where detailed application and precision 
was demanded. However, there is strong enough evidence 
that Jane does not have the techniques necessary for precise 
manipulation and judgment of trends. There is also sufficient 
evidence that in instances where she does not get accurate 
meaning from the data, her tendency is to overgeneralize 
rather than to undergeneralize. The possibility of lack of
experience is ruled out on the grounds that while the class
improved over the period of one year, Jane's scores showed
practically no change. Apparently the experiences provided
for the class did not take with Jane.


ABILITY TO APPLY PRINCIPLES OF LOGIC, TEST 5.1⁶


                              Jane’s Scores   Class Median

Definitions
  Right Conclusions ......          6               4
  Right Reasons ..........          2               2
  Total ..................          8               7
Indirect Argument
  Right Conclusions ......          0               4
  Right Reasons ..........          0               0
  Total ..................          0               4
Ridicule
  Right Conclusions ......          6               6
  Right Reasons ..........          5               3
  Total ..................         11               9
If-Then
  Right Conclusions ......          4               2
  Right Reasons ..........          0               0
  Total ..................          4               2
Total
  Right Conclusions ......         16              18
  Right Reasons ..........          7               5
  Total ..................         23              22


6 For the description of this test, see p. 115, Chap. II.

Apparently Jane’s ability to apply principles of logic, such
as the importance of definitions in arriving at conclusions,
the recognition of the limitations of indirect argument, the
fallacy of trying to disprove by attacking the opponent, and
the logical necessity of accepting conclusions flowing from
the assumptions one has accepted, is approximately at the
average for the class. Her highest score is on recognizing the
futility of ridiculing the opponent as a method of argument.
Her lowest score is in recognizing the limitations of indirect
argument in proof. She seems to use “common sense” logic
but is not particularly conscious of the principles she applies
and has not developed finer techniques of reasoning. Since
the class had devoted a good deal of attention to applying
principles of logic of this sort, the cause must be sought not
in lack of experience but in lack of interest or ability.
Apparently the ability to abstract from the concrete situation which
is required in this test and to draw refined logical
distinctions is not the strong point in Jane’s intellectual make-up.


Jane’s ability to recognize the logical relationships in
arguments and to discriminate between relevant facts and
assumptions and irrelevant ones is at the average for her group.
However, since in each of the categories (relevance, support,
criticalness) the number of reasons she attempts is
considerably higher than the number of reasons she gets
right, a tendency to a broad and somewhat indiscriminate
reasoning is suggested. (The same tendency was manifested
in her methods of interpreting data.) Thus while the
score of “rights” in each case is at the median, she uses a
large number of irrelevant considerations, avoiding the
outright inconsistent and non-critical considerations. Thus,
general common sense combined with the lack of precise
techniques and cautiousness is again indicated.
NATURE OF PROOF, TEST 5.21⁷

                              Jane’s Scores    Class Median
General Accuracy                   129             128
Relevancy
  No. Marked                        96              76
  Relevant                          70              69
  Irrelevant                        16               6
Support
  No. Marked                        66         [illegible]
  Support                           42               2
  Contradict                         8               2
  Irrelevant                        16               3
Criticalness
  No. Marked                        30              22
  Critical                          20              19
  Non-Critical                       3               3
  Irrelevant                         7               2
Conclusions
  Accepts                            5               6
  Uncertain                          3               2
  Rejects                            1               1
Qualifications
  No. Marked                        16              17
  [label illegible]                 10              11


Apparently Jane’s logical abilities are not very high. She
seems to fall short on precise techniques in both inductive
and deductive thinking. Her confession of depending on her
quick grasp and on a last minute rush to complete her
assignments suggests that throughout her career in school Jane
may not have taken the opportunity to cultivate precise
methods of thinking and handling facts. The concentration
of her interests in the direction of the arts, requiring
imagination, and languages, requiring memory, may have in
addition militated against cultivating these processes.


⁷ For the description of this test, see p. 181, Chap. II.
APPLICATION OF PRINCIPLES IN SCIENCE, TEST 1.3

                              Jane’s Scores    Class Median
General Accuracy                    -9              18
Conclusions
  Attempted                         12              13
  Right                              2               7
Reasons
  Attempted                         12              18
  Right                              3              10
Unacceptable Reasons
  Technically False                  1               1
  Irrelevant                         0               2
  [label lost]                       2               1
  [label lost]                       2               2
  [label lost]                       1               1
  [label lost]                       0               0
  [label lost]                       0               0


Jane appears to be weak in the knowledge and use of
scientific principles. In this test requiring application of
scientific principles to everyday problems, Jane’s general accuracy
is far below that of the group. Although she attempted a total
of twelve conclusions, only two were right while ten were wrong.
Both of these scores are the poorest in the group. Similar
behavior is shown in her use of reasons. Since the score on
… her weakness. The school record shows that Jane was
scheduled for a special course in general science in her senior
year to give her more experience in techniques of precise
thinking, a special concession and departure from general policy.
However, there is no report of Jane’s achievement in that
course nor are science information tests included among the
standardized tests given. Thus, the reasons for Jane’s
difficulties with scientific reasoning remain obscure.


SCALE OF BELIEFS, TEST 4.21⁹

                              Jane’s Scores    Class Median
% Liberalism
  Democracy                         73              69
  Economic Relationships            84              38
  Labor and Unemployment            76              74
  Race                              94              70
  Nationalism                       96              70
  Militarism                        87              70
% Conservatism
  Democracy                         12              17
  Economic Relationships             0              20
  Labor and Unemployment            18              10
  Race                               0               6
  Nationalism                        4              12
  Militarism                         3              12
% Uncertainty
  Democracy                         15              12
  Economic Relationships            16              28
  Labor and Unemployment             6              12
  Race                               6              10
  Nationalism                        0              15
  Militarism                        10              13
% Consistency
  Democracy                         65              75
  Economic Relationships       [illegible]     [illegible]
  Labor and Unemployment       [illegible]     [illegible]
  Race                              90              77
  Nationalism                       76              80
  Militarism                   [illegible]     [illegible]
Total
  Liberalism                        83              65
  Conservatism                       7              15
  Uncertainty                       10         [illegible]
  Consistency                       77         [illegible]

⁹ For the description of this test, see p. 215, Chap. III.

A glance at the picture of Jane’s performance on various 
aspects of thinking in comparison with her achievement on 
information tests opens up an interesting hypothesis. As a 
student of high verbal ability and good memory, has Jane 
been permitted to exploit these two qualities without a suf- 
ficient challenge to other intellectual processes? 

Two tests give data on Jane’s social attitudes. One of these
attempts to diagnose generalized social beliefs. Jane
apparently has a clearly thought out pattern of social beliefs. Her
scores on liberalism are high and evenly distributed over all
of the six areas included in the test. Thus, she tends to
approve government control on behalf of the general welfare,
and to reject economic individualism. She accepts equality
for Negroes and thinks they have the same qualities as white
people. She favors the international viewpoint, a logical
counterpart of her interest in foreign cultures. There are
very few items to which she has responded in a conservative
direction. She also seems to be rather certain about her
beliefs. Her responses are highly consistent in all areas,
though in one of them, democracy, she falls in the lowest
quarter for the class, because the class has an unusually high
level of consistency.

There is also a m… the previous year. … and inconsistent in …
judge, then, that Jane’s social beliefs
are mature and clear and probably arrived at by her own
efforts.




SOCIAL PROBLEMS, TEST 1.41¹⁰

                              Jane’s Scores    Class Median
Comprehensiveness
  Total Courses of Action            6               6
  Total Reasons                     48              46
[labels of the remaining rows lost]
                                     3              33
                                    52          5[illegible]
                                     1               4
                                     3               9
                                     9               9
                                     7         [illegible]
                                     4               5
                                     0               0
                                     2               2
                                     3               5
                                    24              26

In test 1.41 the task is to apply social values to
controversial social problems. Here, also, Jane shows a preponderantly
democratic outlook. Sixty-two per cent of the tenable reasons
she has used to support the courses of action she chose are
what are defined as democratic values. She applies them
consistently, only 3 per cent of her responses being
contradictory to the courses of action she chose. She also shows a
higher degree of cautiousness here than on any other test.
Thus, a larger than average fraction of the reasons she
attempts are applicable to the courses of action she chose.
The range of the implications that Jane sees is average for
the group.

Apparently Jane does much better with forms of reasoning
involving broad generalizations and general logical
distinctions. One is also impressed and surprised by the
coherence of her social outlook in comparison to the apparent
immaturity of her personal philosophy and her personal
goals.


¹⁰ For a description of this test, see p. 180, Chap. III.


SKILL IN USING LIBRARIES AND BOOKS, TEST 7.2

                    Unlabeled entries    Jane’s Score    Class Median
References               12, 9               15              24
[label lost]              6, 4                8              14
[label lost]              9, 1               17              17
[label lost]              1, 2                0               9
[label lost]              8, 0               16              13
[label lost]              7, 3               11              16
[label lost]              6, 4                8               9
Total                                        75              99


In skills in the use of libraries and books Jane shows
marked weaknesses. Except for her knowledge of the card
catalog and the use of index information, in which she is at
the median for the group, she shows marked deficiencies,
particularly in knowledge of the use of the Reader's Guide.
Her total score is the lowest in the class. Again a deficiency
with techniques of work is indicated.

By way of general summary, one may point out that Jane 
has good general ability, particularly verbal ability. She has 
a measure of success in logical thinking, but falls down in 
all areas requiring precise knowledge, precise processes of 
thinking or precise skills. In some respects her techniques of 
work seem quite deficient. Her social attitudes are mature 
and liberal. Her interests are highly selective and concen- 
trated on esthetic pursuits, with preference for passive rather 
than productive activities. 

Deficiencies and difficulties seem to be greatest in the area 
of adjustment to other people, both adults and age-mates. 
She seems immature in her attitudes toward herself, other 
people, and work. Her personal goals and ambitions are 
fanciful and show little thoughtfulness. 

Apparently she has had altogether too meager an
experience in challenging, concentrated work, and has cultivated a
tendency to take her work and to approach her interests
somewhat lightly.

It is difficult to tell what would have happened had the
faculty become cognizant of her difficulties sooner. The
faculty made several efforts to meet her needs during her
last year at school. Arrangements were made to send her
to college away from home (neither Stanford nor Bryn
Mawr) with the proviso that she live in the dormitory.
Special science work was arranged in an effort to give her
training in precise thinking. To prevent her being lost in a
crowd, she was transferred from a large orchestra to a small
ensemble, and from mass hockey into a smaller archery …
Her further progress can only be traced in reports on her
work in college.

METHODS OF INTERPRETING AND USING EVALUATION DATA

For Guidance of Individual Students 

As was described in Chapter I, one important purpose of 
evaluation is the guidance of individual students. The tech- 
niques of interpretation illustrated by the case study were 
especially relevant to this purpose. First, the meaning of 
the separate scores had to be clearly understood. The names 
given to these scores, such as “comprehensiveness,” might 
be misleading unless related to the behavior required by the 
test. The meaning of these scores was further determined by 
their deviation from the group average as well as the level 
of expectancy for a given student. 

Second, scores on any test were examined in relation to one
another to arrive at a central pattern of behavior. In several
instruments the scores were so dependent upon one another
that the meaning of any one of them was not clear until the
others were examined. For example, in the Scale of Beliefs two
students might both have a score of 50 on liberalism, and one
might say at first that they were equally liberal. But if the
first had a score of 40 on conservatism and 10 on uncertainty,
while the second had a score of 10 on conservatism and 40
on uncertainty, it is apparent that they were not equally
liberal. The first had made up his mind on 90 per cent of the
issues presented in the test and divided his opinions almost
equally between the liberal and …

… the interpreter had to be
aware of the possibility of a considerable shift in the original
… and false as true) usually indicated a lack of even
rudimentary skill in drawing inferences from data. If, however, the
scores on accuracy were high and scores on other types of
errors low, this score indicated careless reading of qualifying
phrases in the test statements, rather than a deficiency in
techniques of interpreting data.
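
The Scale of Beliefs illustration given earlier in this section (two students, each with a liberalism score of 50) can be restated as a small calculation. The sketch below is not part of the Study's procedure; the function name `belief_profile` and the two derived quantities are assumptions introduced only to show why one score cannot be read apart from the others.

```python
def belief_profile(liberal, conservative, uncertain):
    """Summarize one student's Scale of Beliefs percentages.

    The three inputs are percentages of test items and are assumed
    to sum to 100, as in the two-student example in the text.
    """
    if liberal + conservative + uncertain != 100:
        raise ValueError("percentages must sum to 100")
    # Per cent of issues on which the student took a definite position.
    decided = liberal + conservative
    # Share of definite positions that are liberal (a hypothetical index).
    liberal_share = liberal / decided if decided else 0.0
    return {"decided": decided, "liberal_share": round(liberal_share, 2)}


# Same liberalism score, very different patterns.
first = belief_profile(liberal=50, conservative=40, uncertain=10)
second = belief_profile(liberal=50, conservative=10, uncertain=40)
print(first)
print(second)
```

The first student has decided on 90 per cent of the issues and splits them nearly evenly; the second has decided on only 60 per cent, and most of those decisions are liberal.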

Interpreting a comprehensive set of data from a battery of 
tests and other instruments presented a still more complex 
task of relating variables and revising the meaning of each 
aspect of behavior in terms of the larger pattern. Thus, since 
interests and social attitudes were known to influence think- 
ing, data on thinking needed to be examined in the light of 
evidence on interests and attitudes. Formulation of tentative 
hypotheses of explanation usually helped sharpen the exami- 
nation of evidence that might be thus related. In formulating 
these hypotheses the interpreter was first assisted by the 
structure of the instruments presented in this report, for they 
were designed to reveal relationships between different types 
of behavior as well as possible causes of deviant behavior.
Thus the tests of clear thinking provided some neutral,
scientific problems and other problems in areas involving
personal values and beliefs. If errors in reasoning were
concentrated in the latter, the tests of attitudes and interests
might show that the difficulty lay in lack of interest or in
prejudice rather than in techniques of thinking.

Familiarity with common patterns of behavior in the school
threw further light on the behavior of individual students.
An ambivalent pattern of social beliefs might be only the
result of conflict between the values emphasized by the school
and those held by the community, and therefore might not
be very significant in individual cases. If, however, the usual
pattern of social beliefs in the school lay in one direction
while an individual's pattern lay in another, this was significant
for individual guidance. Similarly, if dislike of writing was
prevalent throughout the school due to overemphasis on
written assignments in all classes, even a moderate exception
to this general rule assumed significance.

This sort of interpretation was essentially a process of
postulating several alternative hypotheses to explain deviant 
behavior, and of checking each hypothesis against other 
data to see which one was most likely to be correct. Once the 
most probable causes of important weaknesses were located, 
it was a problem for the counselor and the school staff to de- 
cide how serious the difficulty was for a given individual 
and what, if anything, needed to be done about it. Illustrative 
guidance procedures have been suggested in connection 
with each instrument as well as in the case study. Individual 
variations were too great to permit a comprehensive account 
of all possible constructive methods. The results of a con- 
sistent program of evaluation over a period of years suggested 
that certain methods work better than others in similar cases. 
However, it must be remembered that evaluation data alone 
could not solve the problems of teaching or guidance. They
only provided a more adequate basis for solving them. Teach- 
ers were sometimes annoyed when a program of evaluation 
revealed certain weaknesses in their program or in some of 
their students without indicating precisely what was to be 
done about those weaknesses. They sometimes concluded 
that the tests were useless. This is like saying that a ther- 
mometer is no good because it does not tell what to do about 
the weather. Tests could not be expected to solve all the 
problems of education, but they could and did call attention 
to many of the problems to be solved. 

For Checking the Effectiveness of Curriculum in Achieving Major Objectives

Another important purpose of evaluation was to discover
whether the school was achieving its major objectives. Most
schools wanted to develop citizens who could think clearly,
who had democratic social attitudes, who were well adjusted,
and the like. Evaluation data indicated the degree to which
changes of this sort were taking place. For this purpose
interpretation of group data was necessary.

In the main, the processes employed in interpreting group 
data were similar to those employed in examining data on 
individuals. In each case it was necessary to determine the 
meaning of individual scores by reference to a more general 
pattern. In both cases hypotheses formulated at any point 
were checked against further evidence. 

The usual method employed in locating strengths and 
weaknesses of a whole group—namely, considering the aver- 
ages and the distributions of scores—was used with these 
data. By this method it was possible to determine the status 
of the group in the separate aspects of behavior measured by 
each instrument, such as the ability to distinguish facts from 
assumptions, or the tendency to mistake popular
misconceptions for sound scientific principles. Frequently, however, it
was necessary to determine also which combinations of
behavior were common to many students in a group, thus
requiring a common treatment.

Thus in the case of interests, the recurrence of a
combination of high interest in music and art, or a combination of
negative responses to English, reading, and foreign
languages by many students were important kinds of evidence
for diagnosing the group. Group medians and distributions of
scores in each of the separate categories did not yield
evidence of this type. A comparison of the patterns of interests
of all individuals in the same group was needed.
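
The difference between category medians and individual patterns can be illustrated with a short tally. This is only a sketch under stated assumptions: the student labels, the three-level ratings ("high", "neutral", "negative"), and the helper `pattern_counts` are hypothetical and do not come from the Study's instruments.

```python
from collections import Counter

# Hypothetical interest ratings for four students in one group.
students = {
    "A": {"music": "high", "art": "high", "english": "negative"},
    "B": {"music": "high", "art": "high", "english": "neutral"},
    "C": {"music": "neutral", "art": "high", "english": "negative"},
    "D": {"music": "high", "art": "high", "english": "negative"},
}


def pattern_counts(students, areas):
    """Count how many students share each combination of ratings."""
    return Counter(
        tuple(ratings[area] for area in areas) for ratings in students.values()
    )


# How often does high interest in music occur together with
# high interest in art in the same individual?
counts = pattern_counts(students, ("music", "art"))
both_high = counts[("high", "high")]
print(counts)
print(both_high)
```

A summary computed separately for music and for art would not show that the same three students account for both high ratings; the tally over individuals does.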

Three types of processes were usually involved in
estimating the progress of a group: a comparison of the scores by
groups in the same grade or by successive grades in the same
school, a comparison of scores made by groups in other
schools with a comparable curriculum, and a comparison of
student achievement with the behaviors specified in the
statements of the objectives.

While the only satisfactory measure of growth was the 
record of the same class over a period of years, a rough
indication of the success of a school program was secured at once
by comparing scores on the same test for successive grades. 
In some areas of objectives, the median of each grade was 
considerably higher than the median of the preceding grade, 
while in other areas, there was no significant difference in the 
grade medians. While the latter might be the general picture, 
particular classes taught by one or two teachers made sig- 
nificant progress. It then became the duty of the school to 
discover the factors which could account for the difference. 
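
The grade-to-grade comparison described above amounts to computing a median for each grade and looking at the differences between successive grades. The following is a minimal sketch with invented scores; the names `grade_medians` and `median_gains` are assumptions, and the sketch makes no judgment about whether a difference is significant.

```python
from statistics import median

# Hypothetical scores on the same test for three successive grades.
scores_by_grade = {
    10: [12, 15, 14, 18, 11],
    11: [16, 19, 17, 21, 15],
    12: [17, 18, 16, 22, 17],
}


def grade_medians(scores_by_grade):
    """Return the median score for each grade, in grade order."""
    return {g: median(s) for g, s in sorted(scores_by_grade.items())}


def median_gains(medians):
    """Return the change in median between each pair of successive grades."""
    grades = sorted(medians)
    return {
        (lower, upper): medians[upper] - medians[lower]
        for lower, upper in zip(grades, grades[1:])
    }


medians = grade_medians(scores_by_grade)
gains = median_gains(medians)
print(medians)
print(gains)
```

In this invented case the median rises between the first two grades and not between the last two, which is the kind of contrast the text says a school would then need to explain.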

The most convenient method of comparing these scores 
with scores made by comparable groups in other schools 
might have been with reference to national norms. Thus, 
while progress might be shown from grade to grade on the 
test of interpretation of data, the median of each grade might 
stand in the lowest quarter of scores made by all other pupils 
of this grade who took the test. Unless some special factor 
was at work, such as very low reading test scores for the 
school population, this might indicate at once that still fur- 
ther progress must be made before the school’s record could 
be considered satisfactory.

This method, however, was avoided as much as possible
in the Eight-Year Study for several reasons. In the first place,
it was recognized that as long as there were important
differences in objectives and curriculum practices among schools,
it would be inappropriate to measure progress by the same
standards, particularly if these standards represented nothing
more than an average of the performance of different groups
under varying circumstances. The pattern of interests in a
school for foreign students in New York City could not
necessarily be considered appropriate as a “norm” or
desirable pattern of interests for a suburban school in the Middle
West, and the average of the two patterns might not be
desirable for either school. Similarly, one would not expect
students in a school which was barely beginning to explore the
methods of developing critical thinking to be judged by the
same criteria as were students who had had long and
careful training.

Difficulties were also encountered because of the methods 
of using norms to which teachers had been accustomed. The 
national average had been invested with almost magical sig- 
nificance, so that many teachers were too easily satisfied if 
their groups came up to it, even when they might have 
greatly exceeded it, and too easily discouraged if their groups 
fell below it, even though their progress was all that could 
be expected. For this reason, only tables of medians of com- 
parable groups in other schools were made available to the 
evaluation representatives of schools in the Eight-Year Study, 
who were trained to interpret them. These gave a rough and 
admittedly cumbersome method of estimating the relative 
progress of comparable groups, but it was hoped that by this 
very fact a more thoughtful use of norms would be stimulated. 

A third possible method of interpreting scores to indicate 
the success of a program in reaching its objectives was a 
comparison of the level of ability revealed by the tests with 
the level of ability required in life situations. Thus, if the 
use of the correct scientific principles in life problems were 
the objective of the school, and the tests revealed that stu- 
dents accepted a variety of popular misconceptions as scien- 
tific principles, then the school had not done enough in this 
direction, even though all other schools showed a similar 
weakness. 

This sort of interpretation, however, had always to be made 
cautiously, because the level of accomplishment demanded 
by life situations was often a matter of vague conjecture. It 
was thus easy to expect too much or too little of students. 
The present level of achieving these newer intangible objec- 
tives may be too much determined by inadequate methods 
of helping students achieve them. Nevertheless, some
comparison of pupils’ performance with life demands was
inescapable if we were not always to rest content with what
other schools were doing. Perhaps none of them was doing 
enough. 


For Checking Hypotheses Underlying the Program 

A third important purpose of evaluation was to check the 
hypotheses underlying the school program. Often new
practices were introduced in the hope of producing certain
desirable changes in students. These changes might not come
about, or they might be accompanied by other changes which 
were less desirable. One public school introduced a core pro- 
gram with several purposes in mind, one of which was to de- 
velop better social attitudes. A comprehensive testing pro- 
gram revealed that while the social attitudes developed were 
clearer, more consistent, and more liberal than in most 
schools in the Study, the students had serious difficulties 
with techniques of precise thinking. In drawing inferences 
from data, they exhibited little caution and showed a tend- 
ency to go beyond the data. In applying facts and principles 
they failed to discriminate those which were valid and
relevant from their opposites. Apparently in emphasizing social
values the school relied too much on generalizations and too 
little upon the careful analysis of factual data. 

In another school the evaluation of reading revealed that
one group specializing in science and mathematics showed
a more limited appreciation than all others, including those
in other grades specializing in the same field. They found
little enjoyment in reading; they did not identify themselves
with their reading or relate their reading to life problems.
Since this was a marked deviation from the type of responses
prevailing in the school, the problem was considered by the
faculty. It developed that a special course in literature was
offered to this group. On the hypothesis that science students
are interested in scientists, this course concentrated on
biographies of scientists and mathematicians. Since it was not the
intention of the staff to narrow the reading interests of these
students, a broader program was agreed upon.

Still another school had hoped to develop democratic at- 
titudes by means of a program of extra-curricular activities 
organized by the student council, while conducting its aca- 
demic curriculum in the usual manner. The results of the test 
on Beliefs about School Life revealed that a large majority 
of these students preferred authoritarian methods of class- 
room management, approved of social distinction of all sorts, 
and in general had tendencies toward undemocratic atti- 
tudes. These results called into question the efficacy of this 
program of student activities for the purpose of democratiz- 
ing school life. In the course of an investigation by a group 
of students and faculty members, it was discovered that the 
student council was run by an inner clique. Many of the
student activities tended to be exclusive and to have other
undemocratic characteristics. The active participation was
limited largely to students in the upper grades. In the light
of the facts brought out by this study, a reorganization of
student activities was undertaken, involving a closer
relationship between curricular and extracurricular activities.

Such instances indicated that special care had to be exer- 
cised when changes were introduced into the program to find 
out not only whether the intended results were produced but 
also whether undesirable features did not accompany them. 
Even if no major changes had been made, the hypotheses on 
which the school had always operated might be faulty. 
Hence, evaluation data needed to be examined with special
reference to the issues underlying the program.


Possibility of Interpretation

The foregoing discussion may have left the impression
that interpretation of evaluation data required very unusual
insight and patience, and too extensive knowledge of
evaluation for the classroom teacher to master. There is no getting
around the fact that a thoughtful interpretation of the
evidence on students’ progress and the effectiveness of
curriculum practices is complex, and that it can be learned only by
long practice supplemented by careful explanation. Yet there 
is no reason to believe that further progress in getting a more 
adequate picture of pupil growth will ever return to the 
primitive simplicity of school marks. Reducing the amount of 
data secured is no solution, for a few scattered data can only 
raise questions, not answer them. A rich and full program 
of evaluation can suggest answers to a great many questions, 
but only by thoughtful interpretation and not by chance. 
Teachers must learn to get meaning from the extensive and 
well-integrated sets of data now available. Unless somebody 
knows what the scores mean and takes them into account in 
his teaching, it is obvious that there is no point in getting 
them. 

On the other hand, the process of interpretation is not so
difficult for busy teachers in a large public school as the
foregoing may suggest. When teachers know the pupils
concerned, hypotheses to account for their test scores readily
occur to them. Then, too, if evaluation is carried on
continuously, the evidence accumulates gradually, and only a
few data need be interpreted at any one time, and fitted into
what one already knows about students. Also, the processes
which appear elaborate, when written down and explained
verbally, easily become part and parcel of the common sense
thinking of thoughtful teachers. Finally, when evaluation is
undertaken as a common task for the school, with the whole
faculty cooperating in interpreting the results, the task for
any one individual is reduced.


Chapter VIII

PLANNING AND ADMINISTERING THE EVALUATION PROGRAM

The preceding chapters have already dealt with many of the 
basic problems in planning and administering an evaluation 
program. They have discussed the purposes of evaluation, its 
basic assumptions, and the steps which must be followed in 
developing appraisal instruments. They have indicated an 
appropriate division of labor among teachers, school officers, 
and experts in evaluation. They have suggested a possible 
classification of school objectives by types of behavior, each 
of which requires a different technique of appraisal. They 
have described instruments and techniques for the study of 
growth toward objectives usually regarded as "intangible," 
such as certain aspects of thinking, social sensitivity,
appreciations, interests, and personal and social adjustment. They
have reported in great detail the method of construction of 
these instruments so that teachers might develop others. 
They have indicated, at least by implication, certain charac- 
teristics which are desirable in evaluation instruments de- 
veloped or selected by a school staff. In addition to those 
usually discussed, such as validity, reliability, objectivity,
appropriateness to age levels, and the like, the characteristics
given special emphasis in this report were the diagnostic
value of the multiple scores yielded by these instruments
and the interrelationships of these instruments, so that each
score was supported and explained by other scores on the
same or other instruments. Finally, the previous chapter dealt 
with methods of interpreting and using evaluation data. 

All of these considerations are pertinent to the problem of
planning and administering an evaluation program. In
addition, certain administrative procedures are essential to assure
the comprehensiveness of the appraisal, to secure the co- 
operation of the entire staff of the school, and to increase 
the practicability of the program. When testing is left to each 
individual teacher, there is likely to be incoordination, and 
the most important objectives—those to which the whole 
school program is dedicated—are frequently overlooked, es- 
pecially since they are usually the hardest to evaluate. Ob- 
jectives which are easiest to evaluate may be emphasized out 
of all proportion to their importance and, as a result, attention 
may be drawn away from other equally important objectives. 
No data may be secured relevant to the hypotheses on which 
the school is operating. Pupils may be overburdened with 
tests in certain departments or at certain times.

If, on the other hand, the actual conduct of the appraisal
is left to an evaluation specialist, there is the danger that
pertinent data will not reach the teachers who should act
… may be reported in a form which … understand, and
recorded in ways … clerical labor. A most common defect is
… and effort are spent in gathering data, … to interpret or
use them for individual …


It is the intention of this chapter to discuss certain prin-
ciples [...] of planning and administering an eval-
uation program which have helped to make it effective in
[...] participating in the Eight-Year Study.
[...] brevity, no account will be given of the
gradual development of these practices, and only occasional
references will be made to the diversity of practice on these
points now prevailing among the cooperating schools. The
chapter will attempt to describe a few illustrative practices
in planning the program, recording the data, and providing
for their effective use.

Planning the Scope and Emphasis of the Program

Early in the Study it was found that a comprehensive eval- 
uation program required careful, cooperative planning by 
the staff of the school. The data necessary for a well rounded 
picture of individual development, of the progress of the 
group, and of the effectiveness of the curriculum would not 
be secured if the task was left to individuals. It was quite
evident that the staff as a whole must decide what to evaluate, 
what kinds of evidence to secure, and how to go about secur- 
ing evidence and using it. As the first step in evaluation in- 
volves the formulation of the school’s objectives, this coopera- 
tive planning of evaluation began with this step. In order 
to secure a statement of objectives which was representative 
of the work done in the school and thus to make sure that 
no phase of growth really emphasized in the school was neg- 
lected, the whole staff participated in the process of formu- 
lating the basic platform of objectives. Each teacher or de- 
partmental group of teachers submitted a list of objectives. 
These lists were then considered by committees and by the 
whole faculty in order to clarify them further and to discover 
where there were common emphases and where unique types
of development were indicated. 

If there was any conflict between the appraisal of the 
school-wide objectives and those held by individual teachers,
it was rather commonly assumed that the first responsibility 
of the school was to its general objectives. While the principle 
was never abandoned that the school as well as individual
teachers should do all they could to study growth toward the
objectives unique to the specific courses, the larger principle
usually prevailed that the study of the most important aspects
of human development as expressed in the general objectives
should be the major concern of a school. The nature and ex-
tent of the appraisal of the specific objectives was to be
planned so that it was consistently related to this general
program and helped to support and clarify it.

Fortunately, the areas of objectives of general concern
were usually limited in number and thus did not constitute 
too heavy a burden either on the resources of the school or 
the time and tolerance of the students. For example, most 
schools were concerned with one or more phases of critical 
thinking, social attitudes, certain work habits and study skills, 
interests and appreciations, social adjustments, and certain 
types of functional information. Hence, in most schools there 
was sufficient opportunity to carry on additional specific in- 
vestigations of student growth. 

A second major principle governing the planning was that 
appraisal was to be continuous. The adoption of this policy 
meant that the schools had to consider the time and effort 
needed for a continuous check before decisions were made 
regarding what range of objectives would be appraised, 
or how detailed the check was to be. As can be seen later, 
this consideration also determined the calendar adopted for 
the administration of the evaluation instruments. 

It was also clearly understood that it was the program of 
the school and its effects on student growth and not the in- 
dividual teacher or pupil that was being appraised. The 
effectiveness of evaluation is likely to be impaired if the 
evaluation program is conceived by the teachers either as an
extension of the usual examinations and marks in courses or
as a means of judging their competence. With the first mis-
conception, teachers may try to find the strengths and weak-
nesses of their pupils with the idea of rewarding the strengths
and penalizing the weaknesses, accompanied by some exhor-
tation to do better, but without making any significant change
[...] less in the whole school program.
[...] teachers may try to justify
[...] For these reasons the [...]
favored [...] which yielded descriptive
diagnoses of students [...]
which, because of this character-
istic, could not be easily converted into grades and marks.
Most of the evaluation instruments used also diagnosed the
kinds of behavior capable of development only through con-
certed and cooperative efforts of many teachers over a period
of time, and not by the work of one teacher in one course or
unit of work.

Finally, it was understood that the evaluation program 
was to serve the local needs and purposes of each school. 
The particular emphasis as well as the extent of the program
was largely determined by what each school needed data
for. Thus many schools had set up an experimental program
on some central hypothesis. Checking that particular hypoth-
esis often required a detailed appraisal of certain specific
types of growth, such as in critical thinking, in range and
maturity of interests, in social sensitivity. In these cases the
evaluation program was planned to give most detailed evi-
dence on these points. Local conditions also influenced the
plans. For example, some schools drawing students from
widely scattered places had to concentrate the evaluation in 
the earlier grades on the diagnosis of interests, abilities, and
basic skills. Still other schools had differentiated sequences of
programs, calling for evidence necessary for the placement
of the students in these sequences as well as for determining 
the relative effectiveness of these programs. Often special 
effort was needed to appraise the acquisition of common skills 
in order to answer the questions of parents and the commu- 
nity who feared that the new curriculum might neglect these 
outcomes.

Certain practical considerations also limited the plans. 
While most schools made an effort to plan the scope and the 
nature of their evaluation programs according to what they 
thought to be important objectives or crucial needs of their 
programs rather than in terms of economy, immediate avail-
ability of instruments and techniques, or the ease of their ad-
ministration, it was natural that the cloth had to be cut ac-
cording to the resources of the school. Thus, financial ex-
penses were involved in administering the testing program
even though much of the scoring was done at the evaluation 
headquarters. Someone’s time and effort was required for 
handling the data, since there was no point in collecting more 
data than could be properly recorded, interpreted, and used. 

In determining how to adjust the scope of the program to 
the limitations of resources, the general principle followed 
was to plan to appraise at least in limited fashion each of the
major areas of objectives before planning a more detailed
evaluation of a single area. This seemed wise first because it 
was recognized that evidence covering a fairly broad range 
of behavior is needed for proper appraisal of the program of 
a school. The schools also realized that teachers tend to em- 
phasize the areas of development the results of which they 
can see more clearly. An even distribution of efforts of ap- 
praisal over the significant objectives was thus expected to 
produce a more even distribution of emphasis in teaching.
Finally, since detailed appraisal was usually given to areas
of objectives which were easiest to appraise or in which in-
struments were readily available, it seemed wise to make
sure that some of the important intangible objectives for
which no refined techniques or instruments were as yet avail-
able would not be overlooked.

Generally speaking, then, while the schools attempted to
evaluate as broad a range of objectives as possible, the actual
program rested on decisions representing a combination of
the ideal possibilities and the practical limitations of the
school situation.


Collecting Data 


Once the staff agreed on the general scope of the program, 
it considered the methods for securing the needed evidence. 
This required a preliminary survey of the data already avail-
able in the school. Only when the faculty had explored the
possible relationships to school objectives of the data which
were already collected was it in a position to decide what fur-
ther data were needed. In the process of clarifying the school
objectives it was usually discovered that the faculty was al- 
ready collecting many types of data on these aspects of de- 
velopment. Thus, many schools had a testing program in- 
cluding aptitude tests, reading tests, and information tests. 
Most schools also had an abundance of less formal types of 
data collected in the normal process of teaching and adminis- 
tering the school. In most cases these data were put to only a 
limited use, partly because they were scattered, partly be- 
cause of the tendency to consider only the scores on objective 
tests as appropriate evidence, but mainly because their bear- 
ing on the objectives of the school was not evident. 

When, however, the objectives were clarified to the point 
where teachers could clearly see the concrete behaviors in-
volved, the bearing on the broader objectives of some data
which teachers were collecting for specific purposes became
apparent. Thus the English teachers found that student writ- 
ing could be examined for evidences of interests, social ad- 
justment, and social attitudes as well as of the ability to spell 
and write correctly. Records of free reading were found to 
yield evidence on maturity of tastes as well as of quantity of
reading. Even such simple data as the records of activities
and subjects taken assumed significance when considered in
the context of other facts about the students.

This examination of the data already available usually in-
dicated certain gaps, that is, objectives on which little evi-
dence was being obtained. Hence, the next step was to plan
the ways and means of securing the additional data needed.
Usually at this point there was a tendency to consider only
paper-and-pencil tests. However, a careful analysis of the
methods of securing evidence most appropriate to each objec-
tive revealed that the classroom situations provided a far
greater source for securing data on students than had usually
been assumed. For the appraisal of some objectives, such as
the ability to plan the attack on research problems, or to use
laboratory techniques and tools, the observation and record-
ing of student behavior in normal classroom situations was
the best if not the only adequate source. Thus, one school
secured data on student growth in planning research by the
simple device of providing students with pads on which to
record in duplicate the successive outlines of the plans they
made. At other points semi-controlled classroom situations,
suitable for both learning and evaluation purposes, could well
be used in place of formal tests. Thus the difficulties encoun-
tered in getting information from libraries and books could
be diagnosed, and in many schools were diagnosed, by giving 
students assignments requiring the use of the library and by 
observing the methods they used in obtaining the necessary 
information. 

These uses of sources of data in processes integral to teach- 
ing were found to be particularly helpful because when teach- 
ers were directly responsible for collecting evidence they 
more often used the results than when only the summary of 
data came to them. However, collaboration and systematic 
allocation of responsibilities on a school-wide basis are neces- 
sary to prevent this method from being too time consuming. 
In economizing effort it was found that certain departments
or teachers of certain areas were in especially strategic posi-
tions to collect one kind of evidence, while others had greater
opportunity to obtain information of a different sort. By sys-
tematizing the use of such informal devices and by making
the results generally available, many schools found that they
could extend the scope of their evaluation through the use
of opportunities already existing in the classroom.

Having agreed upon the informal methods to be used in
obtaining evidence, the next step was to plan the use of more
formal devices. Usually paper and pencil tests were reserved
for points where information was lacking altogether, or
where the available information was inadequate, or where
the use of informal methods entailed too much time and 
effort. Thus, most schools had considerable evidence on 
information and skills, but little or no evidence on the 
growth of students in various phases of thinking. The infor- 
mation on social attitudes secured or securable through anec- 
dotal records, classroom observation, or from student papers 
was found to be too scattered and meager to give an adequate 
picture of social beliefs over a range of social issues of im- 
portance. At many points, then, it was necessary to use addi- 
tional paper and pencil tests, either because they represented 
the only appropriate method of getting the evidence or be- 
cause they were most economical. 


Drawing Up a Schedule for Testing 

In setting a calendar for the testing program, it was neces-
sary to consider several factors. In the first place, the total
time devoted to testing could not be so great that students 
and faculty thought themselves overburdened with tests. 
To avoid this difficulty, careful estimates were made of the 
total time needed for taking all tests which were tentatively 
proposed for the program. Some schools even went so far as
to set up a time limit and to eliminate certain instruments if 
the proposed schedule exceeded that limit. 

In the second place, the schedule had to be drawn so that 
there was no undue concentration of formal tests toward the 
end of the year, and particularly toward the end of the 
twelfth grade, since such a congestion of schedule subjected 
students to unnecessary tension, and did not provide evi-
dence at times when the results could most effectively be
used. Generally, congestion was prevented by devising a ten-
tative calendar for the testing program covering all the grades
of the school. Such a calendar included the repeated use of
certain instruments to check on growth as well as the giving
of certain tests which needed to be used only once. Tests
yielding information basic to understanding new students
and for the initial planning of teaching were usually placed 
early while others were distributed over successive years. 

The schedule also provided for a fair distribution of time 
among the several subject fields so that the testing did not 
take an undue amount of time from any one class. This was 
done by allocating different tests to different departments 
in the school or by staggering the successive periods of the 
day used for giving tests. 

The methods of organizing for this cooperative job varied 
greatly from school to school, depending on the size of the 
school and the make-up of their faculties. In some cases, par- 
ticularly in smaller schools, the school psychologist or coun- 
selor took the major responsibility for drafting the tentative 
plans and for arranging the practical details. In such cases 
much of the participation of the faculty was achieved through 
informal contacts and personal conferences. 

In other schools evaluation committees were established, 
whose responsibility it was to get the necessary information 
and advice from the rest of the faculty, to draw up a plan, 
and to care for the routines. Often members of such commit- 
tees took special responsibility for giving certain instruments 
or series of instruments as well as for collecting certain ma- 
terials from other teachers. 

In still other schools the responsibilities were divided 
among the staff according to the types of evidence to be col- 
lected. Thus a psychologist became responsible for giving the 
psychological tests and reading tests. An evaluation represen- 
tative supervised the use of the special tests developed by the 
Evaluation Staff, while individual teachers were responsible
for information and skill tests in their respective areas. What-
ever the particular scheme, it was found necessary to make
careful, coordinated plans for the entire program of evalua-
tion.

Summarizing and Circulating the Results

Since the evidence of student development was obtained 
from records already existing in the school, from collecting 
data easily obtained as part of the class work, and from espe- 
cially selected tests and appraisal devices, the problem of 
organizing and summarizing these varied types of informa- 
tion was an important one. Part of the task of organization 
was accomplished by using a folder for each student, and 
placing all records relating to this student in this folder. The 
student folder became a file of information to which addi- 
tions were made as the evidence accumulated. 

However, the varied forms of evidence made it necessary 
to utilize additional techniques of organization. The test 
scores were already organized into patterns devised by the
evaluation committees. In the case of data recorded by stu- 
dents or parents, such as entrance information, reading rec- 
ords, and written papers, the administrative problem was to 
organize the record keeping in such a way that a consistent
and cumulative record became available. Thus, in case of the 
reading records, a certain time each week was allotted to stu- 
dents to write down the books they had read during the pre- 
ceding week. Copies of written work were assembled in the 
student folder. 

To obtain satisfactory records from observations made by 
teachers or other persons in a position to observe students in- 
volved several other administrative problems. Chief among 
these was that of obtaining observed facts on behavior, in 
place of ratings drawn largely from memory. Some organiza-
tion was also needed to obtain a sufficiently representative
sampling of the observations from different teachers, sup-
posedly in a position to see the student in different situations.
Staff conferences devoted to clarifying the behavior to be ob-
served and the techniques of obtaining the record most
economically, and then to periodic discussion of records sub-
mitted [...]

[...] to assume the task of summarizing this evidence and of pass-
ing these summaries along. Furthermore, they were expected
to be most familiar with the tests relating to their objectives, 
hence they were usually expected to give these tests and to 
summarize the most pertinent points revealed in the test 
scores. If some other members of the staff, such as the psy- 
chologist, the counselor, or the evaluation representative, 
were responsible for parts of the testing, they assumed the 
responsibility for summarizing the results of the tests they 
gave. 

These summaries of various items of data about a student 
were then brought together by the person mainly responsible 
for his guidance, usually his homeroom teacher or counselor. 
This person was responsible for making an over-all interpre- 
tation of the data, indicating the outstanding strengths and 
weaknesses, pointing out some factors contributing to these, 
and making some tentative suggestions regarding what 
needed to be done. Until this step was taken, one teacher 
knew about his language skills, another about his social atti- 
tudes, another about his techniques of thinking, another 
about his interests, but no one had a coherent picture of his 
development. Hence, few teachers were in a position to make 
comprehensive suggestions regarding what the student 
needed, or able to take constructive action. 

While the summaries of specific data were usually made
at the time when the evidence was secured and when the cir-
cumstances of securing it and its implications were fresh, the
over-all interpretations were made only at certain regular in-
tervals or at times when such information was most needed.
That is, these interpretations were usually made at the times
when reports to parents were being prepared and when par-
ticular curriculum plans were being made. From time to time
the case of an individual student might require a special in-
terpretation of his record. The members of the staff who
made these over-all interpretations usually had some insight
into the psychological implications of behavior, some train-
ing in the interpretation of these types of data, and some per-
sonal contact with the students. In order that the data be
actually used, it was found to be extremely important that 
all data on the growth of a student pass through the mind of 
a person who knew him and had a responsible relation to his 
all-round development. 

The schools found it necessary to develop plans for circu- 
lating information as well as summarizing it. Usually the 
basic data collected by each teacher remained in his posses- 
sion as long as he needed it. The summaries were, however, 
circulated as soon as they became available. This was done 
either by exchange of notes or by frequent meetings of small 
groups of teachers and advisers of each group of students. 
The latter method was most commonly adopted by schools 
where some form of core or unified curriculum was in force,
in which case a small group of teachers was responsible for a
major portion of the school experiences for a given group of
students.

To facilitate still further the circulation of information, the
basic files were placed in spots accessible to teachers and
counselors. If there was a school counselors’ office, the files
were placed there. If teachers acted as counselors, their re-
spective classrooms or offices contained those files. The main
principle was to keep the records of students where they were
most frequently used. Several copies were made of data
which were needed in different places or by different people
at the same time. Thus, often the basic entrance data were
available in teachers’ or counselors’ folders as well as in the
principal’s office.

A somewhat different problem was involved in handling
group data. It must be recalled that all data pertinent to in-
dividual growth could also be summarized so as to give evi-
dence about the strengths and weaknesses common to groups
of students. These group summaries were particularly useful
in appraising the effectiveness of the curriculum. Since the
summarizing of group data requires a certain degree of statis- 
tical competence and since, furthermore, the analysis of these 
data involves comparative study of data on all groups in the 
school, these tasks were usually in the charge of a person or 
a committee responsible for coordinating the curriculum 
program in the school. It was the responsibility of this person 
or committee to analyze and summarize the data and to re- 
port periodically to the faculty on the effectiveness of the 
school program in achieving its major objectives. 

The processes involved in interpreting group data have 
been described in the previous chapter. The chief administra- 
tive arrangement required was to provide time for the staff 
to meet together regularly to study the data, bringing to bear 
upon it the specialized competence and points of view of a 
representative sample of all departments in the school, and 
for cooperative planning of teaching. This time was usually 
secured by a more careful rearrangement of schedules and 
teacher responsibilities. A few schools reduced the total 
teaching period of the day by having students come half an 
hour later. In a great many cases teacher time was saved by 
teaching students to work independently and thus dispensing 
with teacher supervision at some points of their work. 

Using Evidence for Improving Teaching and Curriculum

Availability of evidence alone, no matter how well or-
ganized and summarized, did not assure its effective use. The
implications of the individual and group data to daily pro-
cedures in guidance, teaching, and curriculum making had to
be intelligently digested by every teacher before the greatest
value of the evidence was attained. It was necessary to make
special provisions for teachers to develop the insight and
techniques needed to translate into practice what was learned
about the students.

Definitely scheduled opportunities to study the data was
one of these special provisions. To make maximum use of
evaluation evidence in teaching and guiding students and in 
curriculum improvement was found to require continuous 
study and collective thinking by the whole staff. Occasional 
reports to the staff about the results of the evaluation pro- 
gram proved inadequate for this purpose. At best these occa- 
sional reports served only to acquaint the staff with the fact 
that something could be learned from the evaluation pro- 
gram. Similar limitations were found with occasional case 
study conferences regarding individual students. The occa- 
sional conferences introduced the staff to the techniques of 
analyzing evidence about individuals and suggested some 
possible implications, but they did not provide adequate
opportunity for the staff to explore multiple explanations and
to consider various constructive modifications in daily prac- 
tices which were implied by the evidence. 

A second provision was to see that the staff explored the 
evidence and its implications at those points where decisions 
were to be made and actions to be taken. When discussions
of evaluation data took place apart from any need for action, 
they were often received by the staff with the passivity 
usually accorded to academic discussions and often regarded 
merely as an interesting theory. In many cases, what the staff 
seemed most to need was a clear demonstration of the help-
fulness of the information to the teachers’ ongoing activities.
It was found to make an enormous difference in the attitude
of the faculty toward evaluation data whether the data on
a given student were just “studied” or whether they were
introduced at a time when the staff was concerned with such
questions as what to do about certain students’ lack of suc-
cess in academic work or apparent failure to adjust to the
life of the school. Similarly, when such questions as the use-
fulness of Greek history for the non-academic students or
the advisability of social mathematics for those failing in
regular mathematics were raised, the evidence on the success
or failure of these groups in achieving various objectives
assumed a greater significance. Not only were the implica- 
tions of available evidence scrutinized more carefully, but 
the possibilities of constructive action were also considered 
more thoughtfully when the attack was made in terms of 
definite problems to be solved. 

There were several occasions in the typical school pro- 
cedure which proved to be particularly appropriate for 
demonstrating the usefulness of evaluation data and for in- 
itiating teachers into the habit of basing their decisions and 
practices on whatever evidence was available. Making out 
programs for the students for the year was one such occasion. 
Often, student programs were decided on the basis of such 
factors as: convenience of the time, college requirements, 
previous success or failure in various subjects, and the stu- 
dent’s own wishes. When a fairly comprehensive set of evalu- 
ation data became available an attempt was made to reach 
these decisions in the light of all available data about the 
student. Frequently, also, the program making was done 
cooperatively by a faculty group in charge of a group of stu- 
dents. Such conferences served not only as a means of ac- 
quainting the teachers with what was in the “records,” but 
also to clarify and unify the guidance policies of the school
and as a means of initiating a habit of making decisions of all
sorts in terms of evidence rather than in terms of previous
practice or of unconsidered personal preferences.


Reports to parents offered another occasion to study the 
growth of students, to consider their needs, and to initiate 
the habit of making judgments in terms of evidence. Many 
teachers had felt at a loss in finding a sufficient number of
valid things to say to each parent about the students. Exami-
nation of objective evidence proved to be very welcome at
such times. 

Most of the schools also had to consider from time to time 
certain changes in the curriculum. This afforded another 
occasion for studying evaluation results. These suggested
changes ranged from the proposal to add new courses to the 
possible reorganization of the whole structure of curriculum 
offerings. These occasions were an opportune time to survey 
the effectiveness of the curriculum in terms of available evi- 
dence. Several of the Thirty Schools began with occasional 
staff meetings considering such problems. They proved so
useful that curriculum planning sessions held regularly 
through the year became a frequent practice. Many schools 
held prolonged sessions either in the spring after the school 
was out or in the fall before the year's program was begun.
At this time the information about the growth of students 
toward all objectives of the school was carefully examined 
by the whole staff, and the curriculum plans as well as plans 
for teaching and special activities to be promoted were made 
in terms of that evidence. Weekly conferences throughout the 
school by smaller groups of teachers dealing with the same
group of students were also a frequent practice. 

A third provision was to involve the entire school staff in 
the study of the results of evaluation. Often consideration of 
the implications of the evaluation evidence suggested changes 
in practices which were not under the direct control of any
one member nor any small portion of the staff. For example,
in many cases the sources of difficulties in achieving con-
sistent democratic attitudes appeared to be in the whole or-
ganization of the school, the weaknesses in clear thinking
were apparently produced by an inconsistent approach
among the different teachers, and adjustment problems could
largely be traced to the way in which the program of student
activities was organized. To uncover difficulties of this sort
and to plan constructive remedies, it was necessary to take
the whole staff into partnership in considering and formulat-
ing school policies and in examining evidence helpful in
making wiser decisions. 

As the evaluation program proceeded, it became increas-
ingly clear that to be effective it must involve extensive par-
ticipation by the entire faculty. Teachers had to formulate 
objectives and to agree on the common objectives of the 
school. They had to select certain manifestations of growth 
toward these objectives which could be tested, observed, or 
recorded. While in the choice of instruments technical ad- 
vice was needed, the final decision regarding their appropri- 
ateness rested with the teachers. Similarly, the final decisions 
regarding what was significant in the evaluation data and 
how they could be used in improving school practices could 
wisely be made only by those who were carrying on the job. 
When judgments and decisions of this sort were made by 
“experts” and passed on to the teachers, the results were less 
fruitful. 

A program which involved wide participation naturally 
raised the question of the competence of the rank and file 
of teachers in such matters. Thus, for instance, the ability of 
teachers to interpret properly evaluation data, particularly 
those requiring psychological insight or technical manipula- 
tion, was questioned. The usual assumption, for example, was 
that only statistically trained people could be trusted to deal 
with test scores. The experience in the Thirty Schools was 
that on the whole teachers made better interpreters than 
persons statistically qualified but whose personal contact 
with students was limited. Moreover, since it seemed evident
that unless teachers were trained to interpret evaluation data
for themselves, their ability and insight in using the results
as well as their willingness to do so would remain limited,
the schools in cooperation with the Evaluation Staff embarked
on the job of training the teachers for this work.
Periodic conferences on interpretation were held in each
school. To provide for more continued help, in each school
one person was chosen to act as an evaluation representative
and as an evaluation adviser to the rest of the school. This
person spent some time receiving training in interpretation
either in workshops during the summer or with the Evaluation
Staff during the school year.

Similarly, the use of evaluation data in shaping an im- 
proved school program could not be left to accidental or 
amateurish efforts. Some training and guidance of teachers 
was needed. This did not mean that all teachers were packed 
off to summer school to receive such training. Participating in 
planning and administering the evaluation program and in 
the study and application of its results in itself provided an 
opportunity for training hardly exceeded by any other de- 
vice, provided there were opportunities in the school for the 
staff to think together on these matters and to make cooperatively
decisions which had previously been made by
individuals.

This brief report on the planning and administration of an 
evaluation program provides a further illustration of the 
ways in which the evaluation project was an integral part
of the processes of teaching, of curriculum making, of guidance,
and of teacher education in many of the Thirty Schools.
As a result of its work with the schools, the Evaluation Staff is
convinced that a program of evaluation can achieve its maximum
usefulness only when it is an integral part of the major
tasks of the school. Deriving its direction from the major
objectives of the school, the evaluation program helps to
clarify these objectives into clearly apprehended goals and
purposes which are more effective guides to teaching and
counseling. Exploring each major objective to identify types
of behavior manifestations which will serve to reveal the
progress of students toward this objective helps to focus attention
upon the learner and the meaning of the educative
process to him. Studying the results of evaluation serves to
identify strengths and weaknesses of teaching and inadequacies
in the school program. Effective participation in
these several phases of evaluation serves as a stimulating experience
for teachers in their own continuing education.


PART II 


RECORDING FOR GUIDANCE AND TRANSFER 
The work of the Committees on Records and Reports,
and the Forms produced by them.

groups, for example, studied the objectives of teachers and
schools, but always in relation to particular problems, and 
always with the results obtained by other groups available 
for comparison and use. The list of objectives prepared by 
the Evaluation Staff was particularly helpful to all the com- 
mittees on recording and reporting. 

All record forms that can do so provide space for informa- 
tion of the kinds obtained by the Evaluation Staff, so that 
this can be related to the other data and so can help to 
make a more complete description of the pupil. 

Although it will be said again in relation to various forms 
and their use, it must be emphasized here that no single 
result of evaluation procedures or of observations recorded 
on the forms is considered to be independent of other in- 
formation about a pupil. All the information obtained, as 
would be true if he were studied by a psychologist or
psycho-analyst, contributes to the more complete understand- 
ing of him that becomes the basis for the school’s dealing 
with him. 
Philosophy and Objectives 


The original Committee on Reports and Records considered
with great care former methods of recording facts about
personal characteristics or traits, and the words used in de- 
scribing and reporting about them. 

Out of this study an 
ing the committee ca 
governed the later w 


implicit form was reexamined by the other committees, and
was generally accepted as a guide, though it was realized
that some of it applied most completely to the study of personal
characteristics.


GENERAL PURPOSES AND PHILOSOPHY OF RECORDING

1. The purpose of recording is not primarily that of ...ing.
Instead the fundamental reason for records is
their value as a basis for more intelligent dealing with
human beings.

The first purpose of records is therefore that of form- 
ing a basis for understanding individuals so that effective 
guidance can be given. 

(b) Since the educational process is a continuous one that 
should not be set back at certain transfer points, it becomes
necessary that guidance shall continue across such points
in such a way as to increase the probability of continuity
in dealing with the person. 

An extended purpose of records hence becomes that of 
furnishing transferable information for guidance. 

(c) Because of the need of cooperative and consistent 
dealing with a boy or girl by home and school, as well as 
the right of the home to information as complete and reli- 
able as possible about progress and development, records 
should provide a basis on which reports can be founded,
and reports should be considered an essential and consistent
part of the recording system.

A third purpose of record keeping is therefore to provide
the information needed for reports to the home, and to add
effective ways of giving such information.

(d) Information is needed at all stages of education, and
particularly at points of transfer from one institution to another,
or from an institution to employment, in order that
qualifications of the individual for the new experience can
be fairly judged.

A fourth purpose of record keeping is therefore to provide
information, and methods of transferring it to others,
that will give evidence regarding a pupil’s readiness for succeeding
experiences. This would apply to fitness for a particular
college or other institution.

What might be considered an indirect but nevertheless
important purpose of records is that of stimulating teachers
to consider and decide upon their objectives, judge something
of the relative importance of their aims, and estimate
their own work and the progress of their pupils in relation
to the objectives chosen.

Many teachers think almost entirely in terms of the most 
obvious objectives concerned with the learning of subject 
matter and evaluate their results only in terms of such aims. 
They give little or no consideration to the changes in their 
pupils that should come about as a result of the experiences 
undergone, and so they fail to bring about the development 
that is possible. Through well planned records they can be 
helped to a wider vision and a more constructive influence. 

It is evident that the most valuable and complete record 
that could be made by observation of an individual would 
consist of a record of his behavior throughout life, or that 
portion of it under observation. It is believed that any observational
technique that has value must consist in using
the parts of such a record that can be collected and arranged 
in the time at a teacher’s disposal. This can be done by re- 
cording significant incidents of behavior and interpretations 
of them (the “anecdotal” method), by characterizing in one 
way or another the kinds of behavior observed (sometimes 
called “behavior description”), or by a combination of characterization
and of supplementary analysis in paragraph
form.


Where a teacher deals with a small number of pupils, or
carries a light schedule, the recording of extensive anecdotal
material seems possible and highly valuable. Some institutions
and teachers use such a method even when the written
material cannot be extensive. The more the demands on the
teacher through appointments or pupil load, the less is it
possible to write voluminously, and the more does it seem
necessary for each instructor to digest his observations into
quickly recorded (but not too quickly arrived at) judgments
about the typical behavior of the pupils. No “checking” system,
however, can fit all of the significant differences among
people, no matter how well it is devised, so such a system 
must allow for supplementary notes that modify or add com- 
pleteness to a description. 

As this committee was trying to devise a method and 
blanks for recording facts about a pupil in abbreviated form, 
it was necessary to agree upon working objectives for pro- 
ducing the kind of forms that would serve the purposes de- 
sired. The following objectives were used. 


WORKING OBJECTIVES FOR RECORDS AND REPORTS 

1. Any form devised should be based on the objectives of
teachers and schools so that a continuing study of a pupil
by its use will throw light on his successive stages of development
in powers or characteristics believed to be important.

2. The forms dealing with personal characteristics should
be descriptive rather than of the nature of a scale. Therefore
“marks” of any kind, or placement, as on a straight line
representing a scale from highest to lowest, should not be
used.

3. Every effort should be made to reach agreement about
the meaning of trait names used, and to make their significance
in terms of the behavior of a pupil understood by those
reading the record.

4. Wherever possible a characterization of a person should
be by description of typical behavior rather than by a word
or phrase that could have widely different meanings to different
people.

5. The forms should be flexible enough to allow choice
of headings under which studies of pupils can be made, thus 
allowing a school, department, or teacher to use the objec- 
tives considered important in the particular situation, or for 
the particular pupil. 

6. Characteristics studied should be such that teachers will
be likely to have opportunities to observe behavior that gives
evidence about them. It is not expected, however, that all 
teachers will have evidence about all characteristics. 

7. Forms should be so devised and related that any school 
will be likely to be able to use them without an overwhelming
addition to the work of teachers or secretaries.

8. Characteristics studied should be regarded not as inde- 
pendent entities but rather as facets of behavior shown by 
a living human being in his relations with his environment. 

This last objective is a fundamental one. It has been ob- 
served in the work on both evaluation and recording, and 
must be kept in mind in considering whatever has been pro- 
duced. The one great danger in the use of any forms that 
offer opportunity for recording facts about people is that 
those who use them may revert to the idea of “marking,”
using the material on the forms as a scale for rating, instead 
of as an abbreviated basis for description of the person’s be- 
havior in some area or under some conditions. The various 
record forms too should be considered as supplementing
each other so as to give a more complete description of the
individual than a single form could present.

It should be emphasized that no form produced in this
study is believed to be final, or to be the only kind of form
for its purpose. Because of the generosity of the contributing
foundations and the willingness of the committee members
to give their time and effort, a more extensive and intensive
study of recording has been made than had been
possible before. There is reason to hope, therefore, that these
forms may prove suitable for m...
in view of their wide flexi...
and the material develope...
d trial, though the members
are far from being dogmatic about the form or content of
what is now offered.

While that which has been done by these committees 
represents the most organized work accomplished in record- 
ing and reporting, since it involves the cooperation of those 
in many colleges and schools, the achievements of various 
of the cooperating schools working individually in devising 
forms to fit their own particular needs also deserve mention.
Committees of faculty members studied the conditions and 
needs of their institutions and arrived at interesting and 
valuable methods of collecting and recording information 
about their pupils. 

It is obviously impossible to reproduce and discuss the 
forms produced by such efforts, but other schools may profit
by consultation with cooperating schools whose problems
seem similar to their own.


Chapter X 


BEHAVIOR DESCRIPTION 


Much of the foregoing philosophy was developed while the 
Committee on Records and Reports¹ was making a preliminary
study of its first record-making assignment, which concerned
the study of personal characteristics. This study
began with exploration of what had previously been done 
in this field. The committee found many attempts to clarify 
and organize the study of human beings, with little agree- 
ment on the terms used or the methods employed. From the 
great number of people-describing words in the language, 
however, certain ones had attained somewhat common 
usage. The first survey of terms used by various agencies to 
describe people produced over 150 terms, and a later study 
made by Dr. Rothney listed over 260 trait names. 

All of these words were considered and compared. It was 
found that they fell into sets, each set composed of words
having somewhat the same meaning, so that the number of
markedly distinct characteristics was only a fraction of the
number of names of traits. Each set was considered by itself
until the committee members agreed on the term or terms
that best expressed its fundamental meaning. From the resulting
list of key words the first group of characteristics was
chosen for further study.

¹ COMMITTEE ON BEHAVIOR DESCRIPTION. (Members and those added
during the work. Institutional affiliations are those for the time of
appointment.) Miss Helen M. Atkinson, Horace Mann School for Girls;
E. Gordon Bill, Dartmouth College; Carl Brigham, Princeton University;
Oscar K. Buros, Research Assistant 1933-35, Rutgers College; Mrs. Cecile
Fleming, Horace Mann School; Mrs. Anne Rose Hawkes, The Carnegie
Foundation; Miss Frances Knapp (deceased), Wellesley College; Robert D.
Leigh, Bennington College; William S. Learned, The Carnegie Foundation;
John Lester, The Hill School; Rollo Reynolds, Horace Mann School for
Girls; Eugene R. Smith, Chairman, The Beaver Country Day School; John
Tildsley, Associate Superintendent of Schools, New York City; Ben Wood,
Cooperative Test Service; Stanley R. Yarnall, Germantown Friends School;
John W. M. Rothney, Research Assistant, Secretary, Harvard University.

The criteria for choosing the characteristics to be used 
were: 

1. Importance. The ones chosen should be worth observing
because they throw light on the person being studied.

2. Observability. They should be such that some at least 
of a pupil's teachers will have opportunity to observe sig- 
nificant behavior in relation to them. 

3. Completeness. Taken together they should give a reasonably
complete picture of the person as seen by the adults
dealing with him.

4. Differentness. They should be sufficiently independent
so that teachers can distinguish between them and so that
intercorrelations will not be too high.

From the beginning, the members of the committee were 
agreed that the evidence from research did not justify a 
method of rating, or any type of scale for judging personal 
characteristics, such, for example, as one constructed along
a straight line, or one composed of named points with supposedly
equal intervals between them. It questioned the use
of undefined terms for designating degrees of excellence or
lack of it, and believed that it was unlikely that intervals on
a line or other scale had any accuracy in terms of their relative
size or importance.

Furthermore the committee was not much interested in 
a scale even if it could have been constructed. It hoped 
rather for something that would encourage and help teachers 
to observe and analyze behavior and from the evidence ob- 
tained to reach a better understanding of their pupils as liv- 
ing functioning human beings. 

The members, as has been shown, were definite in their
desire to eliminate comparisons except as they were implicit
in any descriptive material. They therefore set as their goal
a form that:


1. at the time of a single use of it, would, through de- 
scriptions of behavior, present a picture of a person 
not only in terms of his commonest (modal) be- 
havior, but also in terms of the range and variety 
of his behavior under different conditions; 

2. over a period of time would, through a series of 
studies and recordings, constitute a record of devel- 
opment in significant characteristics. 


It would be difficult for any one who had not worked on 
such an undertaking to realize the difficulties encountered. 
The members of the committee covered a wide range of ex- 
perience and specialization that naturally influenced their 
ideas of the work of the committee and their conceptions of 
the use and meanings of certain terms. Some, at the begin- 
ning, were even skeptical about the possibility of working 
out anything of value. Frequently hours had to be spent in 
the discussion of the meanings of a few words whose use 
seemed necessary. Little by little, however, techniques of 
work developed and language difficulties became fewer. The
final form includes only material on which the committee
reached substantial agreement.


The committee’s first achievement was the choice of a
group of characteristics and the development of blanks for
recording behavior in terms of them. A manual for teachers
was also written. The cooperating schools were asked to
study groups of pupils by the use of the forms and manual,
to send the completed blanks to the committee, and to make
suggestions for revisions. Mr. Oscar K. Buros of the research
staff and an assistant studied the results and from their
analysis made suggestions for changes. A blank and a manual
incorporating the revisions decided on were next prepared.
After rather a large number of pupils had been studied
the intercorrelations among descriptions were worked out at
Columbia University under the direction of Dr. Ben Wood.
The figures showed that either some names of traits con- 
veyed too nearly the same meaning to teachers despite the 
committee’s attempt to differentiate their meanings, or else 
certain characteristics were so closely related that they 
tended to appear in similar ways in many situations. In either 
case the aims for the undertaking were not being achieved. 
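
The kind of check described here can be illustrated in modern terms. What follows is a minimal sketch only, not the procedure actually used at Columbia: it assumes that each teacher's description has been coded as a whole number purely for the sake of the arithmetic (the committee itself rejected scales for reporting), and the names `pearson`, `overlapping_pairs`, `trait_a` and so on, the sample codes, and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def overlapping_pairs(codes, threshold=0.8):
    """Return (name, name, r) for trait pairs whose descriptions move together.

    `codes` maps a trait name to one coded description per pupil.
    The threshold is an illustrative assumption, not a figure from the study.
    """
    flagged = []
    for a, b in combinations(sorted(codes), 2):
        r = pearson(codes[a], codes[b])
        if abs(r) >= threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged


# Hypothetical coded descriptions for six pupils on three traits.
codes = {
    "trait_a": [1, 2, 3, 4, 5, 2],
    "trait_b": [1, 2, 3, 4, 5, 3],
    "trait_c": [3, 1, 4, 2, 5, 1],
}

for pair in overlapping_pairs(codes):
    print(pair)
```

A pair flagged in this way would correspond to the two possibilities named in the text: either the two trait names mean nearly the same thing to teachers, or the characteristics themselves tend to appear together.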

The committee made the changes indicated. It also added 
to the scope of the information asked for, since some valu- 
able facts seemed to be omitted, and rewrote and enlarged 
the manual. Further trial, experimentation and testing re- 
sulted in still further changes, though with less radical al- 
terations in later steps. Eventually a considerable body of 
material was again submitted to correlation study, this time 
at Harvard University by Dr. Rothney. This study found 
that the characteristics were sufficiently different and the 
judgments of the teachers sufficiently well made so that the 
reports were significant descriptions of the pupils. 

Even after this the committee again called for criticisms 
and suggestions from the schools and tried to refine its work. 
It is hoped that it is now in such form that it will have value
to schools in general, and perhaps to more advanced institutions.²

It will be clear from the material itself that the method 
of studying pupils devised by the committee depends on 
the supplying of descriptions of the different kinds of behavior
that are likely to be observed in relation to the characteristics
chosen. The descriptions made by the committee
are designed to define what might be called types or classifications
of behavior in terms of each characteristic. The use
of carefully worded standard definitions in place of teachers’
own wordings is intended to bring about a more nearly common
understanding of the characteristics themselves and of
the persons described. The form resulting also reduces
greatly the time required for recording and for using the
record for purposes of interview or transfer.

²The form, modified in its text material only by the addition of two
characteristics and two additional questions, was used by Dartmouth
College in "The Dartmouth Visual Survey of The Dartmouth Eye Institute."
It is said to have served the purpose of the study successfully. The cards
have also been used to study the students in a college dormitory. It is
likely that a form planned especially for college use will eventually be
published.

In general, all teachers having opportunity to see a
pupil would be expected to describe him by the use of this 
material. The combined reports, which would appear on the 
Behavior Description card, would show the pupil's most 
common behavior, as well as the range of behavior under 
different conditions.
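
As a rough illustration of how such combined reports might be read, the sketch below tallies the descriptions given by several teachers for one characteristic and reports the commonest (modal) description together with every description that appeared. It is a sketch under stated assumptions, not the committee's key system: the function name `summarize_reports`, the teacher labels, and the sample data are hypothetical; only the type names are taken from the "Creativeness and Imagination" list.

```python
from collections import Counter


def summarize_reports(reports):
    """Combine teachers' descriptions of one pupil on one characteristic.

    `reports` maps a teacher (hypothetical labels here) to the type name
    that teacher chose. Ties for the commonest description are all kept.
    """
    counts = Counter(reports.values())
    highest = max(counts.values())
    modal = sorted(label for label, n in counts.items() if n == highest)
    return {
        "modal": modal,            # commonest description(s)
        "range": sorted(counts),   # every description that was reported
        "counts": dict(counts),    # how many teachers chose each one
    }


# Hypothetical reports from four teachers.
reports = {
    "English": "GENERAL",
    "Science": "SPECIFIC",
    "History": "GENERAL",
    "Art": "PROMISING",
}

summary = summarize_reports(reports)
print(summary["modal"])
print(summary["range"])
```

The range is listed alphabetically rather than ranked, in keeping with the committee's wish that the descriptions not be read as points on a scale.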

It is recommended that the descriptions be recorded twice 
a year through the six years of junior and senior high school. 
To the degree that the information covers this period the 
card becomes a record not only of what a pupil is like at
any one time but of his many-sided development through
this period of his growth.


Use of Record Cards


To show the manner in which the classifications are considered
and used in school practice the entire section on
“Creativeness and Imagination” is quoted here from the
manual.


CREATIVENESS AND IMAGINATION 
NOTE: The question whether what is created has been created
before does not enter into this discussion. Newness to the person
in question, and the extent of the contribution he himself makes,
determine the amount of creativeness shown. Creation includes
not only originating entirely, but also recombining old elements
and seeing new relationships. Some characteristics that tend
toward creativeness are:

the desire and habit of trying new things, of putting things
together in new combinations (experimentation),

the ability to think new things, an art form, a melody, a new
concept, a new situation (imagination),

the ability to organize, direct or control new combinations
of people or things (executive manipulation).


TYPE IA. General: those who approach whatever they do with
active imagination and originality, so that they contribute something
that is their own.

TYPE IB. Specific: those who make distinctly original and significant
contributions in one or more fields.

Discussion: For secondary school pupils this might occur in
writing, the fine or applied arts, music, drama, or research in
scientific or other fields.


Examples: One may show the possession of this trait by: 

1. Expressing one’s emotions and thoughts through such
media as language, arts and crafts, music, or drama.
This might result in the writing of poems, stories or essays,
in the conception and execution of pictures, statues, costumes,
or stage sets, or in one or more of various other
such expressions.

2. So expressing an old idea that it is reinterpreted through
a new viewpoint or a different organization of material.

3. Using logical processes with such imagination that he sees
implications and relationships that open new fields of
thought or throw light on old ones.

4. Bringing to the planning and activities of the day thinking
and action that result in improved procedures. This
might appear in the formulation and carrying out of a
procedure for study investigation, the accomplishment of
a task, or the manipulation of a group.

5. So completely projecting oneself into a situation that it
becomes his own. One can listen creatively to a symphony,
or can interpret with originality the one whose
part he plays in dramatization.

6. Combining elements (as in an invention) to produce a
new result or improve a procedure.


TYPE II. Promising: those who show a degree of creativeness that
indicates the likelihood of valuable original contribution in some
field, although the contributions already made have not proved
to be particularly significant.

Discussion: This includes those who show imagination and approach
their problems creatively, although, perhaps because of
lack of experience or of opportunity in the fields in which they
will eventually contribute, they have as yet shown indications
rather than demonstrated accomplishments.

TYPE III. Limited: those whose general attitude shows the desire
to contribute their own thinking and expression to situations, but
whose degree of imagination and originality is not in general high
enough to have much influence on their accomplishments.

Discussion: A person of this type may make occasional contributions
of some general value where particular experience or
other favorable influences make this possible, or may from time
to time show originality in details rather than in general situations.


TYPE IV. Imitative: those ...
tive contributions themselves, ...

... words “General,” “Specific,” and so forth are key words defining
each type of behavior as well as one or two words
could be found to do it.

Under some characteristics two t...
ber but with letters added, ...
This occurs where th...
lated types of behavi... in which
... the characteristic
in question. Both IA and IB in this example indicate a
highly creative approach to problems on the part of those 
they describe, but of two listed under these definitions the 
one under IA might be thought of as applying his creative 
ability more extensively, while the one described by IB 
would respond less generally, but quite possibly with equal 
or greater intensity, to the particular stimuli that do arouse 
his creativeness. 

The Behavior Description card, because of its size, which
is that of a filing envelope for an 8½” by 11” file, cannot
easily be shown in this volume. It is possible however to describe
what is most significant about it. It consists of:

1. A listing of characteristics and the descriptive clas- 
sifications under them. 

2. Spaces opposite the classifications that make it pos- 
sible to include on the card the study of a pupil 
over the six years of junior high school and senior 
high school, or over the seventh and eighth grades 
and the four year secondary school. 

3. A key system for use in recording the judgments of
teachers. This will be illustrated later under “Responsibility-Dependability.”

4. A considerable space for “General Comment.”


The entire list of characteristics that use defined descriptions
of types of behavior follows as it appears on the filing
card.*


RESPONSIBILITY-DEPENDABILITY 

Type 
RESPONSIBLE AND RESOURCEFUL: Carries through whatever is
undertaken, and also shows initiative and versatility in
accomplishing and enlarging upon undertakings. 1

CONSCIENTIOUS: Completes without external compulsion
whatever is assigned but is unlikely to enlarge the scope
of assignments.

GENERALLY DEPENDABLE: Usually carries through undertakings,
self-assumed or assigned by others, requiring only
occasional reminder or compulsion.

SELECTIVELY DEPENDABLE: Shows high persistence in under- 
takings in which there is particular interest, but is less 
likely to carry through other assignments. 

UNRELIABLE: Can be relied upon to complete undertakings 
only when they are of moderate duration or difficulty 
and then only with much prodding and supervision. 

IRRESPONSIBLE: Cannot be relied upon to complete any 
undertaking even when constantly prodded and guided. 


CREATIVENESS AND IMAGINATION


GENERAL: Approaches whatever he does with active imagination
and originality, so that he contributes something
that is his own.

SPECIFIC: Makes distinctly original and significant contributions
in one or more fields.

PROMISING: Shows a degree of creativeness that indicates the
likelihood of valuable original contribution in some
field, although the contributions already made have not
proved to be particularly significant.

LIMITED: Shows the desire to contribute his own thinking
and expression to situations, but his degree of imagination
and originality is not in general high enough to
have much influence on his accomplishments.

UNIMAGINATIVE: Has gi...


INFLUENCE 
CONTROLLING: His influence habitually shapes the opinions,
activities, or ideals of his associates.

CONTRIBUTING INFLUENCE: His influence, while not controlling,
strongly affects the opinions, activities, or ideals of
his associates.

VARYING: His influence varies, having force when particular 
ability, skill, experience, or circumstance gives it op- 
portunity or value. 

CooPERATING: Has no very definite influence on his associ- 
ates, but contributes to group thinking and action be- 
cause of some discrimination in regard to ideas and 
leaders. 

PASSIVE: Has no definite influence on his associates, being
carried along by the nearest or strongest influence. 


INQUIRING MIND


GENERAL: Responds with consistent, active, and deep interest 
to any intellectual stimulus and uses to good advantage 
various sources of information. 

SPECIFIC: Responds with consistent, active, and deep interest 
only to stimuli arising in specific fields or problems. 
Uses effectively the sources available for such purposes. 

LIMITED: Somewhat sensitive to stimuli arising from limited 
fields, but engages in exploration and investigation only 
when a general plan of attacking the problem is indi- 
cated to him. 

DIRECTED: Responds to stimuli in a limited field of interests 
but is impelled to act only when both the plan and the 
details of procedure are definitely outlined for him. 

UNRESPONSIVE: Rarely seems to be sensitive to any intellec-
tual stimulus and shows little or no ability to use the
tools and methodology of exploration and investigation. 


OPEN-MINDEDNESS 


DISCRIMINATING: Welcomes new ideas but habitually sus-
pends judgment until all the available evidence is ob-
tained.

TOLERANT: Does not readily appreciate or respond to oppos- 
ing viewpoints and new ideas, although he is tolerant 
of them and consciously tries to suspend judgment re- 
garding them. 




PASSIVE: Tolerance of the new or different is passive, arising 
from lack of interest or conviction. Welcomes, or is in- 
different to, change, because of lack of understanding 
or appreciation of the new or of that which it replaces. 

RIGID: Preconceived ideas and prejudices so govern his think- 
ing that he usually ends a discussion or an investigation 
without change of opinion. 

INTOLERANT: Is actively intolerant; resents any interference 
with his habitual beliefs, ideas, and procedures. 


THE POWER AND HABIT OF ANALYSIS; THE HABIT OF
REACHING CONCLUSIONS ON THE BASIS OF
VALID EVIDENCE

HIGHLY ANALYTICAL: Habitually makes an analytical ap-
proach to his problems, assembling the facts, showing
a clear perception of their relationships and implica-
tions, and thinking through the situation to well-founded
conclusions.

INCOMPLETE: Makes an intelligently analytical approach to 
his problems but is more limited in ability to assemble 
the facts completely, and to see their relationships or 
their implications. 

IRREGULAR: On occasion shows unusual analytical power but
does not do so habitually.

UNDEVELOPED: Shows signs of analytical power, but because
of fears, the domination of others, or some other inhibit-
ing agency, has not yet developed it to any high degree.

LIMITED: Is able to pursue reasoning processes if aided by
some guidance and direction.

PASSIVE: His approach to a problem is not an analytical one,
though he may be able to appreciate a train of reason-
ing or to follow one laid out by some one else.

UNREASONING: Seems unable to analyze even a fairly simple
[...] rely on memory as a substi[...]


SOCIAL CONCERN


GENERALLY CONCERNED: Shows an altruistic and general social
concern and interprets this in action to the extent of his
abilities and opportunities. (Type 1)

SELECTIVELY CONCERNED: Shows concern by attitude and ac-
tion about certain social conditions but seems unable to
appreciate the importance of other such problems. (Type 2)

PERSONAL: Is not strongly concerned about the welfare of
others and responds to social problems only when he
recognizes some intimate personal relationship to the
problem or group in question. (Type 3)

INACTIVE: Seems aware of social problems, and may profess
concern about them, but does nothing. (Type 4)

UNCONCERNED: Does not show any genuine concern for the
common good. (Type 5)


EMOTIONAL RESPONSIVENESS


TO IDEAS: Is emotionally stirred by becoming aware of chal-
lenging ideas. (Type 1)

TO DIFFICULTY: Responds emotionally to a situation or prob-
lem challenging to him because of the possibility of
overcoming difficulties. (Type 2)

TO IDEALS: Responds emotionally to what is characterized
primarily by its personal or social idealism. (Type 3)

TO BEAUTY: Responds emotionally to beauty as found in
nature and the arts. (Type 4)

TO ORDER: Responds emotionally to perfection of function-
ing as it is seen in organization, mechanical operation,
or logical completeness. (Type 5)


SERIOUS PURPOSE


PURPOSEFUL: Has definite purpose and plans and carries
through to the best of his ability undertakings consist-
ent with this purpose. (Type 1)

LIMITED: Makes plans and shows determination in attack-
ing short-time projects that interest him, but has not yet
thought out goals for himself.


POTENTIAL: Takes things as they come, meeting situations 
somewhat on the spur of the moment, yet may be capa- 
ble of serious purpose if once aroused. 

UNRELIABLE: Makes plans that are fairly definite, but cannot 
be counted on for the determination to carry them 
through. 

VAGUE: Is likely to drift without the decision and persistence
that will enable him to carry out his vaguely conceived
plans.


SOCIAL ADJUSTABILITY


SECURE: Appears to feel secure in his social relationships and 
is accepted by the groups of which he is a part. 
UNCERTAIN: Appears to have some anxiety about his social 
relationships although he is accepted by the groups of 
which he is a part.
NEUTRAL: Shows the desire to have an established place in 
the group, but is, in general, treated with indifference. 
WITHDRAWN: Withdraws from others to an extent that pre- 
vents his being a fully accepted member of his groups. 
NOT ACCEPTED: Has characteristics of person or behavior that 
prevent his being an accepted member of his group. 


WORK HABITS


HIGHLY EFFECTIVE: A pupil having highly effective work
habits would be likely to reach the maximum accom-
plishment for one of his ability.

ADEQUATE: A pupil having adequate work habits would ac-
complish all that would commonly be expected of one
of his ability.

PROMISING: While his habits are not yet adequate, they
show promise of becomin[...]


It will be seen that the subheads under “Emotional Re-
sponsiveness” are not exclusive, since a pupil might respond
to any number of them. In this respect the treatment of this 
characteristic differs from that of the others. 

The key for recording teachers’ judgments, which a school 
can extend as it seems necessary, lists abbreviations that 
show the type of opportunity a teacher has had for observ- 
ing the pupil being described. 

The following example will show how this is used. 

Under "Responsibility-Dependability" six types of behav-
ior are defined. They will be listed by their numbers and
key words, and the judgments of nine teachers about a pupil
will be shown as they would appear on a filing card:


1  Responsible ............... M—HR
2  Conscientious ............. NS—SS—E—F
3A Generally Dependable ...... A—Mu
3B Selectively Dependable ....
4  Unreliable ................ PE
5  Irresponsible .............


This indicates that the teacher of mathematics and the home-
room teacher believe the boy fits the definition of Type 1,
that teachers of natural science, social science, English, and
French place him as Type 2, that art and music teachers
would describe his behavior as of Type 3A, while the one
in charge of physical education would place him under
Type 4.

The total picture of this boy’s behavior (but only in re-
spect to his responsibilities) shows him to be highly con-
scientious in meeting the demands of academic work and
of the group (home-room) with which he is closely con-
nected. It also shows that for some reason he is not so highly
dependable in the arts, and that he is failing to meet with
any consistency the obligations that are related to physical
education. It is not, of course, safe to make positive judg-
ments about the arts and the physical education from this
information alone. Evidence about the other characteristics
may throw light on what is shown here, and personal rela-
tionships, home obligations, or other factors may enter into
the situation.
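
The reading of the filing card in this example can be sketched in a few
lines of Python. This is an illustrative sketch only and is not part of
the committee's material: the spelled-out teacher labels, the dictionary
layout, and the function name tally_by_type are assumptions made for the
illustration. Only the nine judgments themselves come from the example.

```python
from collections import defaultdict

# Judgments from the example: teacher -> type of behavior chosen.
# Teacher labels are spelled out for readability (an assumption; the
# card itself uses abbreviations taken from the school's key).
judgments = {
    "mathematics": "1",
    "home-room": "1",
    "natural science": "2",
    "social science": "2",
    "English": "2",
    "French": "2",
    "art": "3A",
    "music": "3A",
    "physical education": "4",
}

# The six types defined under "Responsibility-Dependability", in card order.
TYPE_ORDER = ["1", "2", "3A", "3B", "4", "5"]


def tally_by_type(judgments):
    """Group teachers under the type each one selected, in card order."""
    by_type = defaultdict(list)
    for teacher, behavior_type in judgments.items():
        by_type[behavior_type].append(teacher)
    # Include types nobody selected so the whole card is visible.
    return {t: by_type.get(t, []) for t in TYPE_ORDER}


card = tally_by_type(judgments)
for behavior_type in TYPE_ORDER:
    print(f"{behavior_type:<3} {', '.join(card[behavior_type])}")
```

A layout of this kind keeps the situation attached to each judgment, so
that a reader sees where the less common types occur as well as how many
teachers chose each type.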

It is evident from this example that a principal, super- 
visor, or guidance officer can not only obtain information 
from the numerical distribution of judgments and the situa- 
tions in which extremes of behavior occur, but also can take 
into account what he knows about teachers and courses, in 
this way reaching a more accurate understanding of the 
pupil than would otherwise be possible. 

While one outside an institution cannot obtain so com- 
plete an understanding as this, information from this card 
and the comment of a supervisor, recorded on such a form as 
that used for transfer to college (Chapter XII), can give a 
very accurate description for the use of a college admissions 
officer or a prospective employer.

The fact that the classifications under any heading on the
card were not intended to constitute a rating scale cannot
be too strongly emphasized. The committee was also agreed
in the belief that the classifications obtained could not even
be said to define orders of excellence, since there was no
certainty that some earlier classes were better than others
that were later in the lists, nor that behavior of a certain type
was best for all kinds of people under all kinds of conditions.
It is true that the first cl[...] behavior that would be co[...]
last are, in general, not [...]

The classifications are therefore simply items of the de-
scription of a person in terms of his behavior under various
conditions, as judged by a number of practiced and suppos- 
edly impartial observers. It is of course true that the limited 
number of descriptions cannot exactly describe all possible 
kinds of behavior. It is believed, however, that the definitions 
will usually fit closely enough for practical purposes, par- 
ticularly since when necessary they can be modified by fur- 
ther comment. 

In addition to the characteristics so far listed there are 
four on the card about which the only judgment asked for 
each is whether it is present or absent to a marked degree. 
The four, which are defined on the blank, are PHYSICAL 
ENERGY, ASSURANCE, SELF RELIANCE, and EMOTIONAL CONTROL. 

Two other details are worthy of notice. At the end of the 
printed material there is a place for indicating the judgment 
of the faculty in regard to the success of the pupil in four 
broad fields of thought and activity. These are “abstract 
ideas and symbols,” “people,” “planning and management,” 
and “things and manipulation.” It is thought that where 
there are marked differences in success in these areas the 
evidence may prove valuable in guiding a pupil toward 
suitable after-school experiences. The information may help
to decide whether or not the pupil should go to college, and
if so to what kind of a college, whether or not he should
undertake some form of specialization, what kind of a job
he should try to obtain.

The other detail is the large space left for “comment.” 
This is useful for the recording of information that explains, 
amplifies or brings into relationship the description on other 
parts of the card. 

Successful use of the behavior description material re- 
quires study of the manual and careful following of its direc- 
tions. At first this may seem to require more time than a 
teacher is able to give. However, the time needed for re-
cording will grow rapidly less as one becomes familiar with
the method used, particularly if a teacher is already observ- 
ing and analyzing the behavior of his students to the extent 
any good teacher should. It is the conviction of the commit- 
tee that time spent in better understanding of a pupil does, 
in any case, justify itself in better relationships and more 
effective work. 

It is interesting to know in this connection that one pub- 
lic school system has adopted this form for the study of 
12,000 pupils in junior and senior high school and expects 
soon to extend it to another 6,000 pupils. Some colleges, as
has been said, have found the card valuable in obtaining 
and recording facts about behavior, and many types of 
schools are experimenting with the material. Samples have 
gone to other countries, even to Russia and South Africa, 
as well as to most sections of the United States. 


SUMMARY OF ADVANTAGES


This form replaces “rating” as a basis for studying indi- 
viduals by description of behavior as observed by adults 
having a variety of associations with the one studied. 

In general it shows, for any characteristic, a pupil's most
common behavior and range of behavior. Where no mode
appears, the judgments being so scattered as to have no
[...] itself has significance, the particular
implications depending on the pattern of judgments and the
characteristics in question.

Taken as a whole, the card wh[...]ably complete picture of
the [...] him.

[...] transferred to a cumulative record card or a college
entrance blank, or be used as a basis for an interview with
parents. On a college entrance blank the information can show the
pupil’s most common behavior and the number of reporters 
who observed it, and can indicate the range and under what 
conditions extremes occur. The form in Chapter XII shows 
such a transfer from this card. 


Chapter XI 


TEACHERS’ REPORTS AND REPORTS TO THE HOME


During the Study various schools wrote to the chairman of 
the Committee on Evaluation and Recording asking about 
tendencies in reports to parents and expressing dissatisfac-
tion with existing forms. A sub-committee¹ was therefore ap-
pointed to investigate the practices of schools, to analyze
tendencies in reporting, and to make recommendations of 
forms for teachers’ use and for sending reports to the home. 
This committee’s first step was to collect report forms from 
schools of various kinds, and to ask the schools to say how 
and why present practices were unsatisfactory and to com- 
ment on what reports should be. The report cards obtained 
were carefully studied, and the criticisms and suggestions 
sent in by the schools were analyzed. Quite a number of 
schools, however, sent no forms, saying that they had noth- 
ing that would be of any help in the undertaking. It became 
clear at once that the most general demand was for some- 
thing that would replace numerical or letter marks, and 
would give more usable information about a pupil's strengths
and weaknesses.

[...] convinced that the single mark in a
subject hid the facts instead of showing them clearly. The
[...] an average of judgments about various
[...] progress that lost their meaning and
their value when thus combined. The schools believed that
the value of a judgment concerning the work done by a
pupil in any school course or activity depended on the
degree to which that judgment was expressed in a form
that showed his strengths and his weaknesses and therefore
presented an analyzed picture of his achievement that would
be a safe basis for guidance.

¹ [...] Burton P. Fowler, I. R. Kraybill, Elvina Lucke, Eugene R.
Smith, Chairman, John W. M. Rothney, Research Assistant.

There was also a feeling that marks had become competi- 
tive to a degree that was harmful to both the less able and 
the more able, and that they were increasingly directing the 
attention of pupils, parents, and even teachers, away from 
the real purposes of education toward the symbols that 
represented success but did not emphasize its elements or 
its meaning. 

The commonest method of replacing marks proved to be 
that of writing paragraphs analyzing a pupil's growth as 
seen by each teacher. This method is an excellent one, since
good descriptions by a number of teachers combine to give
a reasonably complete picture of development in relation
to the objectives discussed. On the other hand, a report in 
this form is very time-consuming for teachers and office, as 
well as difficult to summarize in form for use in transfer and 
guidance. The committee decided on a compromise that 
would make place for giving definite information about im- 
portant objectives in an abbreviated form and would allow 
for supplementing this with written material needed to mod- 
ify or complete the information. 

To find the objectives, the list collected by the Evalua-
tion Staff and the forms worked out by the committees for
the various subject fields (Chapter XIII) were studied. It
was discovered that there were five objectives that were
common to all fields and experiences, and about which
knowledge would be particularly valuable to parents as well
as to pupils. These five objectives were therefore chosen as
headings to be reported on by all teachers and to be used
in reports to the home. The wording adopted for them is
not, however, identical with the wordings on the forms used 
in subject fields. The reason is that this committee had to 
draw from the large amount of information asked for on 
the subject forms that which could be condensed into sim- 
ple phrases that would have meaning and importance on a 
report to the home. The headings follow: 


Success in Achieving the Specific Purposes of the Course
Progress in Learning How to Think
Effectiveness in Communicating Ideas:
    Oral
    Written
Active Concern for the Welfare of the Group
General Habits of Work


The question of classifications to indicate degrees of suc-
cess or growth in relation to these objectives proved a diffi-
cult one. After much discussion and experimentation it was
decided to take as a point of departure the usual expecta-
tion for one of the age group and the background of the
pupil in question. Two classifications above and two below
are used. They are defined as follows:


IS OUTSTANDING: The pupil has reached an outstanding stage of
development in the characteristic and field indicated: that
[...]al for pupils of the same [...]

[...] has reached a stage of development
[...] than usual, perhaps with promise of even-
tually reaching a superior level.

IS AT USUAL STAGE: Th[...]
of development fo[...]

IS BELOW USUAL: The [...]
particular help from the home and
school or greater effort on the part of the pupil.

IS SERIOUSLY BELOW: The pupil is seriously below an acceptable
standard in the field indicated.


In this particular these forms depart somewhat from the
descriptive method that is emphasized in the work of all the 
committees, though taken as a whole these blanks are still 
highly descriptive. This departure, however, should not be 
thought of as too inconsistent, since the purpose of these 
forms affected to some extent the method to be used. It 
seems likely that the time will come when each pupil is 
judged primarily in accordance with his ability and his op- 
portunities, rather than in comparison with others. There is 
still demand, however, for information that will tell parents 
with some definiteness where their children are showing 
strengths or weaknesses as judged by normal expectations. 
These forms try to meet that demand and at the same time 
to describe the pupil’s progress in a way analytical enough 
to give helpful guidance. 

In addition to the section that tells the degree of success 
a pupil is achieving in the five objectives listed, there are 
three other sections of the report. The first gives opportunity 
for the teachers to point out weaknesses a pupil should par- 
ticularly try to eradicate. There are eight of these listed, and 
the subjects in which the weaknesses are evident are shown 
on the home report: 


Accuracy in following directions
Efficient use of time and energy
Neatness and orderliness
Self-reliance
Persistence in completing work
Thoughtful participation in discussion
Conscientiousness of effort
Reading


There is also opportunity for the teachers to report on
the pupils’ likelihood of success in continuing to work in
their fields, both in later years in school and in advanced
institutions.

A section for “General Comment” appears on the teacher's
report, and on the report to the home. Some schools copy the 
most valuable of the teachers’ comments upon the home re- 
port form. Others summarize criticisms and suggestions in 
this space. Occasionally so much of value should be sent 
that an attached sheet must be used, but in general the space 
for comment seems to be sufficient.

In all the details that have been mentioned the teacher's
report and the home report are identical, although they dif-
fer in arrangement, since the home report is designed to
combine the reports of all the teachers into a single form 
that can be read easily. 

There are two forms of the report to the home. They in- 
clude the same material but differ in arrangement in a way 
that produces somewhat different emphases. Form A tends 
to emphasize the objectives in which a pupil is strong or 
weak, while Form B goes further in showing a pupil's degree 
of success in individual subjects. A school can choose either 
form or can do as a school represented on the committee has 
done. This school liked the completeness of the teachers'
reports so well that it decided to send copies of all of them
to the parents instead of using the combined report form. 

While one of the greatest values of these forms is the way 
in which they provide for guidance by analyzing a student's 
progress instead of trying to express several factors in one 
"mark," the form has other advantages. 

An important one is the degree to which it directs the 
minds of pupils, parents, and teachers away from marks to- 
ward the fundamental objectives with which pupils should 
be concerned. Incidentally, in this procedure it is not easy
[...] in a way to make the less able pupil
[...] able one become smug, for in such
an analysis even the poorest student is likely to find some
appreciation, while the best student is likely to discover some
weaknesses to be corrected.

It hardly seems necessary to point out the fact that this
form, like the “Behavior Description,” attempts to describe 
somewhat fully a phase of the behavior of a person. In this 
case, it is principally the pupil as one who is learning and 
developing mental power that is observed. As in the other 
form, the pupil is studied by a number of teachers, and the 
mode and distribution of response in different environments 
is recorded. The comment appearing on the form sent to 
the parents becomes an analysis of what is shown under the 
various headings, and a recommendation of ways in which 
the pupil can be helped to overcome his weaknesses and use 
his ability more effectively. 

A word of warning about the introduction of such report 
forms may not be amiss. Pupils and parents should receive 
some explanation of the meaning of the information given 
so that they will not be confused by the very completeness 
of what is said and will not be antagonized by the unfamiliar 
material. 


Chapter XII 


FORM FOR TRANSFER FROM SCHOOL 
TO COLLEGE 
CONFIDENTIAL REPORT TO THE COMMITTEE ON ADMISSION 


The need for a new transfer form has been widely recog- 
nized. Schools everywhere wish a uniform blank, since the 
present waste of the time of school officers, because of the 
wide variety of forms used by different colleges, has reached 
serious proportions. 

Recognition of the extent to which marks and “units” are 
preventing schools and colleges from giving their best serv- 
ice to individual students, and are interfering with educa- 
tional progress, also becomes daily more widespread. The 
reasons for replacing marks by analyses were discussed in 
relation to reports to the home. Units, too, become the ob- 
jectives for which pupils strive, sometimes with little con- 
sideration of the methods by which they are obtained. In 
many schools, also, reorganized courses, activity programs, 
and long time researches (though on a secondary school 
level) have so changed the schedule that the definition of a 
unit no longer has meaning.¹ A college entrance form with
less emphasis on marks and units can help greatly toward
overcoming the abuses that are of so much concern to the
schools. Then, too, it is incr[...]
tion should have a degree [...]
existed, and th[...]
provided by the schools for use in college. The entrance
blank seems a natural place for such information. 

As an example of this general movement, the Committee 
on School and College Relations of the Educational Records 
Bureau, which is composed of school and college represen- 
tatives, has sent bulletins to the colleges emphasizing needed 
changes in information required at entrance, and has pub-
lished² the answers of the colleges, which show quite gen-
eral willingness to cooperate in making the changes. Another
bulletin has recently been sent to the colleges, and the an-
swers will soon be published. A striking example of the inter-
est taken by educators in the various needs being discussed
is the fact that the Educational Records Bureau Committee 
has given the Committees on Records and Reports of the 
Progressive Education Association standing as sub-commit- 
tees of its own in order to keep in touch with their work, 
and to lend its support to whatever promises progress in 
better school and college relations. 

This dissatisfaction with entrance blanks was focussed by 
the necessity, under the Eight-Year Plan, of developing an 
entrance form that would accomplish two objectives: 


1. Have such a range of flexibility and such carefully
chosen items that it would not restrict any school’s
curriculum or methods.

2. Provide for information complete enough to replace
effectively the data that was omitted under the spe-
cial plan for the cooperating schools, and significant
enough to assist in the guidance programs of the
colleges.

The Committee on Evaluation and Recording appointed a
sub-committee³ to work on this problem. This committee,
after studying previous reports on the subject, explored the
forms in use, especially those prepared by groups of colleges.
All forms that had wide use were analyzed, and their items
were listed with ratings of their prevalence in present blanks.

² Published by the Educational Records Bureau, 437 West 59th Street,
New York City.

³ The members of this committee were: Victor L. Butterfield, Genevieve
L. Coy, Albert B. Crawford, Ruth W. Crawford, Burton P. Fowler, Elvina
Lucke, Herbert W. Smith, Eugene R. Smith, Chairman, Arthur E. Traxler,
John W. M. Rothney.

The committee also asked schools for their criticisms of 
entrance blanks and their suggestions for improvement, and
on the basis of the two surveys a new blank was devised and 
has been in use by the cooperating schools with the very 
large number of colleges to which they send students. 

The first page of the form⁴ is given over very largely to a
tabular history of the courses the pupil has taken in school,
and a combined recommendation and prediction for work
in college. This table allows a school that wishes to do so to
record only traditional marks and units, but it also allows for
courses not easily expressed in units and not recorded by
marks, since it has space for final recommendations in the
major departments most likely to be presented for entrance
or followed in college, and provides blank spaces for addi-
tions. If this form were being prepared now it would prob-
ably have no column for units, but when it was being devised
the movement for omission of unit equivalents in Statements
of Credit had not reached the point it has since attained.

The second page is given to test records and includes a
blank space for “Summary Interpretation” of tests whose
results are not easily expressed in numerical forms. Such tests
include ones described in the “Evaluation” section of this
report, as well as tests of primary abilities and others that
have important sub-heads.

The particular contribution of the third page is the tabular
form for the description of a pupil's behavior, and a resulting
characterization of him. The table is based on definitions of
the characteristics and the sub-heads under them as they
are given in the “Manual of Behavior Description,” and is
supposed to be used with those definitions. (See Chap. X.)

⁴ The form is between pp. 469-497.

The method of recording, which reports the judgments of
all the teachers dealing with a pupil, gives two very important 
facts about his behavior in respect to any one of the charac- 
teristics: 

1. His most common type of behavior. 
2. The range of behavior on one or both sides of the 
modal heading. 

For example:

WORK     Highly effective   Adequate   Promising   Ineffective   Limited
HABITS   English                       M-5                       Math. Sci.


This would indicate: 

a. that the pupil's work habits had been judged by eight
people, of whom five thought they accorded best with the 
definition of “Promising”; 

b. that in English, because of response to the subject, the 
influence of the teacher, or some other reason, his habits 
seemed “Highly Effective”; 

c. that in mathematics and science his habits were as de- 
fined under “Limited.” 

These facts might have great significance both for con- 
sideration of a candidate for college, and for guidance if he 
was accepted. 
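
The two facts named above, the most common type of behavior and the
range on either side of it, can be worked out mechanically from the
separate judgments. The following Python sketch is offered only as an
illustration under stated assumptions: the five reporters who chose
"Promising" are not named in the example, so they appear here under the
hypothetical labels "Reporter 1" to "Reporter 5", and the function name
summarize and the shape of its result are likewise assumptions.

```python
from collections import Counter

# Column headings in the order in which they appear in the example row.
HEADINGS = ["Highly effective", "Adequate", "Promising", "Ineffective", "Limited"]

# The eight judgments of the example. The "Reporter n" labels are
# hypothetical stand-ins for the five unnamed reporters behind "M-5".
judgments = [
    ("English", "Highly effective"),
    ("Math.", "Limited"),
    ("Sci.", "Limited"),
] + [(f"Reporter {i}", "Promising") for i in range(1, 6)]


def summarize(judgments):
    """Return the modal heading, its count, and the range of headings used."""
    counts = Counter(heading for _, heading in judgments)
    ranked = counts.most_common()
    # When the top two counts are equal, no single mode appears.
    tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
    mode = None if tied else ranked[0][0]
    positions = sorted(HEADINGS.index(heading) for heading in counts)
    return {
        "reporters": len(judgments),
        "mode": mode,
        "mode_count": None if tied else ranked[0][1],
        "range": (HEADINGS[positions[0]], HEADINGS[positions[-1]]),
        # Situations in which the judgment departs from the mode.
        "outside_mode": {who: h for who, h in judgments if h != mode},
    }


summary = summarize(judgments)
print(summary)
```

The entries under "outside_mode" correspond to the subjects written into
the row by name, and "mode_count" corresponds to the figure in "M-5".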


A school that did not wish to use any tabular method of
description might omit the use of this table and describe the
candidate in paragraph form on the next page.

The fourth page is left for the school’s comments. It may 
replace the table on page three but in any case it gives the 
opportunity to supplement, modify, and summarize the rest 
of the blank. It ends with a place for the definite recommen-
dation of the school head, an item that all colleges seem to
value.

Other items on the blank are self-explanatory and differ 
only slightly from commonly used headings. 

All the items most commonly asked for by the colleges, and
possible for the schools to furnish, are included on the blank,
while those that have been found to have little importance in 
actual use have been omitted. An occasional college asks for 
one or two additional facts, which can usually be given under 
“Comment” if no other place seems more suitable for them. 

This form has been in successful use for four years, and 
its use is spreading to schools outside of the Study, sometimes 
through initiation by a school, sometimes through its adop- 
tion by a college. It is hoped that in its present, or a modified, 
form it will show the way to a uniform blank for the schools 
and colleges of the country.⁵

A reproduction of the blank, filled in, follows. The use of
“C” to show predicted success if a subject is “continued,”
and of “U” to show ability to “use” it in other fields if it is
not continued in college should be noted. “U” is not entered
unless the prediction for continuance is not high.


THE “JUNIOR YEAR” BLANK


An increasing number of colleges are interested in obtaining
information about candidates when they are in the eleventh
grade. Information at that time need not be so complete
as in the twelfth grade, but it should follow much the same
lines.


To supply this need a preliminary report form was also pre- 
pared and is in use by the schools. 
5 An important contribution in this respect has recently been made by
the publication of a blank prepared by a committee representing a number
of associations. See Appendix, p. 508.


Chapter XIII 


STUDY OF THE DEVELOPMENT OF PUPILS 
IN SUBJECT FIELDS 
Departments in the various subject fields studied their ob- 
jectives more intensively during the early years of the Eight- 
Year Plan than the teachers concerned, or perhaps any group 
of teachers, had ever done before. It became evident in this 
study of objectives that teachers in general, even excellent 
ones, were not fully aware of any but the most general, and 
therefore vague, purposes for which they were supposed to 
be working, and that they often had little appreciation of the 
importance of the changes that were brought about in their 
pupils by the experiences of school and out-of-school life. 
As a matter of fact many an instructor is teaching in his
particular subject field (or is teaching at all) only because he
found that subject easy and so made a good record in it himself.
He assigns a lesson or presents material to his classes,
expecting a certain success in learning, but he never looks
deeply into his pupils’ emotional responses and thought processes
or analyzes the developmental stages through which
they pass, and the reasons for them.
Because of increased realization of the need for a more 
analytical approach to the problems of teaching, a demand 
arose for help in making and keeping teachers aware of the 
aims for which they should strive. A committee¹ was therefore
appointed to investigate methods of recording that might
serve such a purpose.

1 The members of this committee were: Helen M. Atkinson, Genevieve
L. Coy, Harry Herron, G. H. B. Melone, Edith M. Penney, Eugene R.
Smith, Chairman, Arthur Traxler, John W. M. Rothney. They were assisted
by a very large number of school and college teachers who contributed
greatly to the undertaking.
The original committee included specialists in various
fields, as well as executives. Its first conclusion, resulting from 
a comparison of objectives of large numbers of teachers, was 
that, while it did not seem possible to make one form that 
would be suitable for use in all the fields of knowledge and 
activity, it would be possible to develop separate forms for 
those fields that would not only be consistent, but would 
parallel each other in many respects. 

Further experimentation convinced the group that the 
work should be done largely by specialists in the various 
fields, assisted by some members of the general group who 
had studied recording intensively. 

The first detailed attack on the problem was made by dividing
the original committee, according to its subject interests,
into those who would work in English, social studies,
mathematics, and science, and by inviting other school and
college representatives to join these groups. Meetings usually
started with a discussion of the questions involved in the
general problem, after which the four groups met separately,
coming together again to report progress at the end of the
second day.

A very significant development was the increase in breadth
of thinking that came to all of the groups, the growth in recognition
of the similarity of purposes in different fields, and
an appreciation of the importance of common and correlated
effort to achieve such purposes. Not only did the groups in
mathematics and science spend much time working together,
but the mathematics group asked the teachers of social studies
to consider a question with them, or some other combination
attacked a problem together. After preliminary forms
were made, other teachers and schools were asked to criticize
them, and eventually through really grueling work carried
on with considerable sacrifice by some of the workers,
four forms were arrived at.

When this stage was reached, others were invited to join
the committee and forms were added for foreign languages,
art, music, physical education, and homemaking.

It was expected that two forms might be needed for for- 
eign languages, one for the modern and the other for the clas- 
sical languages, but as the work went on it seemed likely that 
one form could well cover the objectives for both divisions. 

Two comments have special significance regarding all the 
forms. The first is that it proved impossible in any field to 
limit the objectives to a number that teachers in general 
would be able to use. The main headings under which judg- 
ments can be made are reasonably few, but the sub-heads 
considered important by the committees increase the possible 
number of judgments to a point where few teachers would 
have the time to make so complete a study of their pupils. 
This may be a strength instead of a weakness, for it brings 
in enough flexibility to enable any school or teacher to choose 
the objectives that fit the aims of the institution or the teacher, 
and to concentrate on the study of their degree of attainment. 
The record is, then, just as simple, or as extended, as one
chooses to make it. It depends absolutely on one's judgment
as to which objectives are important enough to justify careful
study of each pupil's development in respect to them.

The second comment concerns the “Behavior Description”
section on the back of each card. Each committee that analyzed
and stated the aims of its department included development
in respect to most of the characteristics in the “Behavior
Description” list. Each group eventually realized that
these characteristics had already been exhaustively studied
by a very competent committee, and that there would be no
advantage in duplicating that work, even if it were possible
to do so. Accordingly, the committees made places for the
“Behavior Description” in abbreviated form on their progress
cards. It must be understood, however, that this part
of the cards can be applied with full effect only through use
of the definitions of characteristics and classifications explained
in the Behavior Description section of this report.

A valuable feature of most of the cards is their inclusion of 
a prediction of future success in the field in question. This is 
meant to be a basis for the prediction on the “Confidential 
Report to the Committee on Admissions.” Information under 
“Significant Interests” and the headings following that one 
are also valuable for transfer as well as for guidance. 

The committees endeavored to make these cards as nearly
self-explanatory as possible, both in the listing of objectives
and the explanation of methods of recording. Here too, however,
it must be emphasized that in recording the pupil as
high, modal, or low in regard to any objective, the teacher is
indicating the kind of growth the pupil is making rather than
giving him a mark. The pattern of judgments about the objectives
considered should show where the pupil is developing
well, and where poorly, and should thus provide data for
helping him.

Unfortunately the committees were unable to prepare such
cards for all the purposes that might have proved useful. It
is likely that the most important omission concerns “core”
courses that either include two or more fields, such as English
and social studies, or are concerned primarily with the life
needs of the pupils. It seems possible, however, that objectives
not much different from those that would have been
chosen for such a course can be found on the card for “Social
Studies,” and that this card can therefore be used without
serious disadvantage. There have been requests for cards for
drama and for instrumental music also, and such cards may
yet be devised.

Perhaps in no kind of recording is a teacher likely to be
so critical as in that to be used in his own subject, and the
less one has studied the detailed objectives in a field, the
more likely he is to overlook the implications in such lists as
are on these forms. The committees, though they make no
extravagant claims for their product, hope that anyone interested
in such forms will take time for careful consideration
before deciding that the cards do not quite adequately serve
the purposes for which they were designed. It should be
noted, for example, that “conscientiousness,” which most
teachers would expect to find in the list, is not on the front
of the card because it is included under “Responsibility-Dependability”
in the Behavior Description on the back of
the card. Some headings that at first thought seem essential
appear in less general form, or are included in more general
statements. On the English card, for example, “Skill in
obtaining information other than from books,” is included,
while the more common and important (in this field) purpose
of obtaining information from books is omitted. It is
omitted because it is too important and so must appear in
more analyzed form. It will be found in such headings as
those under “Techniques and Skills” in “Use of Various
Reading Techniques,” and in the “Reading Record.” It is of
course included in “Mastery of Essentials of the Course.”

To show the method and organization used for these cards 
the front of the English card is reproduced here. 

The back of the card includes, as has been said, the
Behavior Description (Chapter X) but uses only the key
words, the definitions being omitted. It also has spaces for
recording the results of comparable tests, and for making
notes about:

Significant Interests, Activities, and Accomplishments
Special Abilities
Significant Limitations
General Comment


The cards in the other subjects follow the same general
plan as the English card, but they differ in details in accordance
with the particular purposes of the various courses.
These differences are not listed, since to do so would require
what would approximate a reproduction of all the cards,
and in a rather confusing arrangement. It seems much better
for one interested in a particular field to obtain a sample
card for that field, in order to study it as a whole.

These cards differ from the ones described in other chapters
because while the others are primarily office forms, these
are just as definitely teachers’ forms, planned to help the
teachers in their study of their pupils, and to serve as source
material for the other records. From them can be taken the
teachers’ judgments for entering on the “Behavior Description,”
and much that goes on the “Form for Transfer from
School to College.” They serve as a basis for the teachers’ reports
that become reports to the home. If a cumulative record
form is kept, much of the information on it must come from
the teachers’ cards. It seems, therefore, that these cards,
except when data is being taken from them, might well
remain in the hands of the teachers, serving as reminders of
objectives and offering the opportunity to record information
whenever it seems timely.


APPENDICES 


Appendix I 


CUMULATIVE RECORD FORM


(Prepared by a Committee of the
American Council on Education¹)


As was said in Chapter IX, no work was done by the Committee
on Evaluation and Recording on a cumulative record
form for the use of school offices because the American Council
on Education was planning to revise the form that had been
used so widely since its publication in 1930. The revision for
secondary schools has now been completed and the card can be
obtained from the Council's office in Washington. It accords with
the principles and methods of the other forms described in this
volume, and so fits well into the set from which a school can
choose its equipment for recording.

The cumulative record form is a double sheet of tagboard that 
fits an S" by 11" file. It furnishes space for all the commonly 
recorded facts about a pupil and his family, and for a six-year 
history of his school career. 

One of the largest spaces on the card is given to the history
and analysis of the pupil's progress in subject fields. This allows
opportunity for whatever type of reporting a school uses, though
the directions suggest some form of analysis such as is described
in Chapter XI. Alternative forms provide for recording test
results in tabular or graphic form, and there is also provision for
interpreting the test record in relation to the pupil's academic
achievement.

The “Description of Behavior” section uses material from the
card and manual described in Chapter X, and adds spaces for
advice by guidance officers, and for follow-up after the pupil
leaves school.
1 Richard D. Allen, Associate Superintendent, Providence, R. I.; Millard
E. Gladfelter, Temple University; William S. Learned, Carnegie Foundation
for the Advancement of Teaching; John W. M. Rothney, Wisconsin University,
Secretary; Donald J. Shank, Assistant to the President, American
Council on Education; Eugene R. Smith, The Beaver Country Day School,
Chairman; Arthur E. Traxler, Educational Records Bureau; Edmund G.
Williamson, University of Minnesota; Ben Wood, Cooperative Test Service.


UNIFORM COLLEGE ENTRANCE BLANK


In 1941 under the joint auspices of The American Council on
Education and The National Association of Secondary School
Principals a committee was appointed representing these associations,
and the New England Association of Colleges and
Secondary Schools; the Middle States Association of Colleges
and Secondary Schools; the North Central Association of Colleges
and Secondary Schools; the Southern Association of Colleges and
Secondary Schools; the Progressive Education Association; and the
American Association of Collegiate Registrars, for the purpose of
considering the demand for an improved and uniform college entrance
blank. The chairman and secretary of the Committee on
Evaluation and Recording were members.

This committee considered blanks already prepared by various
groups and agreed upon a form which has now been published
by the National Association of Secondary School Principals and
can be obtained from its office in Washington, D. C.

While this form is much more condensed than that prepared
for the Eight-Year Study, having in particular a limited space
for free comment about the candidate, it has much in common
with that form and recognizes much the same educational principles.
It offers opportunity for the use of analyses or predictions
instead of marks if a school prefers them, omits any reference to
units, provides space for annual tests, and gives emphasis to the
description of behavior.

This form shows marked progress toward present-day objectives
and promises to influence school and college relations constructively.
TABLE 2


Correlations between Scores on Form 2.51 (Corrected for Attenuation) for
284 Pupils in Two Large Public High Schools not in the Eight-Year Study


                              Accuracy  Accuracy  Accuracy
                              with      with      with
                              True-     Probably  Insuf-
                              False     True and  ficient   Beyond             Crude
                                        Probably  Data      Data     Caution   Errors
                                        False

General Accuracy               .766      .650      .786     -.734    -.132     -.880
Accuracy with true-false                 .470      .314      .075    -.359     -.882
Accuracy with probably true
  and probably false                              -.002     -.052    -.741     -.30…
Accuracy with 
insufficient 
data es —.719 
Beyond data "я 2 p 1631 
Caution — 150 
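
The caption of Table 2 describes these correlations as "corrected for attenuation." A minimal sketch of that adjustment follows, assuming the usual Spearman formula (the observed correlation divided by the square root of the product of the two reliabilities). The source does not say which formula was applied; the function name and the sample figures below are hypothetical and are not taken from the table.

```python
from math import sqrt


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation between two scores free of measurement error.

    r_xy: observed correlation between score x and score y
    r_xx: reliability of score x
    r_yy: reliability of score y
    Assumes the classical Spearman correction (an assumption, not the
    source's stated method).
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)


# Hypothetical figures: an observed correlation of .60 between two scores
# whose reliabilities are .81 and .64.
corrected = correct_for_attenuation(0.60, 0.81, 0.64)
print(round(corrected, 3))
```

Because the divisor is never greater than 1, a corrected coefficient is never smaller in absolute value than the observed one, which may help in reading the rather large values in the table.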
TABLE 4


Correlations between Certain Scores on Form 1.3b for 283 Pupils 
in Two Schools in the Eight-Year Study 


Г] | | | 5 
а 2/2 |а | = |5 
"E: DS | Es |=, 
| ERE | oË glee 
ЕЕЕ ES SEE Bee 
Score B3 3 S2 82222222 Ва 
28| 2 |22\22)2 6/3 212 3/28 
Column) 1 | 8 | 9 | 12 | 16 | 18 | 19 | 25 
Uncertain rea- 
sons, lack of | 
knowledge 5 34 
Number reasons F m 
Right reasons 8 53 
Percent reasons 
right 9 | 68 
Number 
principles 11 42 
Number right 
principles 12 53 | .85 
Percent 
principles right 13 62 
Number right 
controls 16 38 .04 
Number analogies 18 11 
Number right 
analogies 19 -21 | .70 .59 | .01 
Number 
authorities 21 .61 
Number right 
authorities 29 58 .40 | .01 .54 
Percent ridicule, 
Tel., A. C. 25 —.46 
: Percent 
inconsistent 27 —.20 24 à 
Appendix III


TABLES FOR CHAPTER III


TABLE 1
Means, Standard Deviations, and Reliabilities for Test 1.41 


Grade 10 Grade 11 Grade 12 Total 


Mean|Sigma | г |Mean|Sigma | r [Mean Sigma | r |Mean|Sigma | г 
Total Reasons!.......|54.2 | 11.1 |.82,55.3 | 13.7 |.87/46.8 | 11.9 |.89 51.8 | 12.9 57 
Accurate Кеазопз!..../37.5 | 8.3 |.85|39.8 | 9.5 |.8936.5 | 9.0 |.8737.9 | 9.0 E 
Ва, иза 4.5 .98.82| 4.9 | 1.1 .87| 4.5 | 1.1|.84| 4.6] 1.1 09 
No. Inconsistent! 6.4) 4.2 |.78| 5.7 | 4.3 |.76| 3.5 | 2.6 |.65| 5.1 | 3.9 |- 
% Inconsistent! 9.1 | 8.0 8.0 | 7.5 48| 5.5 7.2) 7.2 0 
Untenable?. 6.2 | 2.5 |.35| 6.2 | 2.6].44 4.4 | 2.5 |.52 5.5 | 2.7 |.5 
Irrelevant?. »|$.9]| 1.9 3.5] 22 2.$| 1 3.3 | 2.0 РЯ 
Undemocratic Valuest| 6.4 | 4.7 |.84 5.5 | 4.8.80 3.8 | 42 1.86 521 47]. д 
Democratic Values?...|22.3 | 8.6 |.9025.4 | 8.9 |.9123.7 | 9.3 [92238] 9.0 T 
Rationalizationt......| 8.2 | 3.1 |.54| 7.7 | 3.7 |.722 5.9 | зо [63 7.2 | 3.4 or 
% Democratic Values? 62.9 | 16.4 |. 


1 Computed by split-half method. 
? Computed by Kuder-Richardson formula. 
3 Computed by correlating two forms of the test 1.41 and 1,42. 
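
The three footnotes above name three ways of estimating reliability. As a rough illustration only, the sketch below shows how the first two are commonly computed today for items scored 1 or 0: an odd-even split-half correlation stepped up by the Spearman-Brown formula, and Kuder-Richardson formula 20. The source does not state which Kuder-Richardson formula or which way of splitting the test was used, so the function names, the odd-even split, the use of the population variance, and the sample data are all assumptions. The third method (correlating two forms) is simply the `pearson` function applied to pupils' scores on the two forms.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability with the Spearman-Brown step-up.

    item_scores: one list of item scores per pupil.
    """
    odd = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)


def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 1 (right) or 0 (wrong)."""
    n_pupils = len(item_scores)
    n_items = len(item_scores[0])
    total_variance = pvariance([sum(row) for row in item_scores])
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_scores) / n_pupils  # proportion right
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: five pupils, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(split_half_reliability(responses), 2))  # 0.88
print(round(kr20(responses), 2))                    # 0.8
```

The two estimates need not agree, since the split-half figure depends on how the items happen to be divided, while the Kuder-Richardson figure uses every item separately.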
TABLE 2

Reliability Coefficients for Test 4.21-4.31

                     9th     10th    11th    12th
                     Grade   Grade   Grade   Grade   Total
                     (108)   (145)   (169)   (179)   (601)

Liberalism
  D    Col.  1        .74     .78     .78     .80     .79
  ER         2         …      .78     .80     .84     .81
  LU         3        .80     .83     .85     .86     .84
  R          4        .81     .84     .88     .86     .86
  N          5        .66     .75     .79     .80      …
  M          6        .79     .84     .88     .86     .86
Conservatism
  D    Col.  7        .62     .70     .76     .72     .74
  ER         8         …      .73     .81     .78     .78
  LU         9        .70     .77     .79     .78      …
  R         10        .83     .71     .83      …      .81
  N         11        .66     .72     .75     .69     .78
  M         12         …      .79     .82     .80     .80
Uncertainty
  D    Col. 13         …      .86     .86     .83     .85
  ER        14        .81     .82     .82     .82     .82
  LU        15        .82     .84     .85     .83     .84
  R         16         …      .74     .79     .83      …
  N         17        .78     .82     .80     .81     .81
  M         18        .84     .84     .85     .83     .84
Consistency
  D    Col. 19        .54     .42     .57     .32     .56
  ER        20        .42     .42     .56     .51     .51
  LU        21        .48     .57     .57     .57      …
  R         22         …      .44     .58     .58     .54
  N         23        .23     .38     .51     .54     .46
  M         24         …      .55     .68     .65     .61
Totals
  Liberalism          .93     .94     .95     .95     .95
  Conservatism        .91     .92     .94     .92     .93
  Uncertainty         .96     .96     .96     .96     .96
  Consistency         .82     .78     .87     .88     .85
TABLE 4


Intercorrelations of Certain Scores on Scale of Beliefs 4.21-4.31

[Table body not recoverable from the source.]

Appendix IV


TABLES FOR CHAPTER IV


Students from a large public senior high school are the only ones who have taken the
final revised form 3.32. Eleven classes, distributed as follows, constituted the population.


TABLE 1


Grade      Boys    Girls    Total

10           56       59      115
11           46       56      102
12           52       66      118

Total       154      181      335
TABLE 2


Means, Standard Deviations, and Estimates of Reliability of “Appreciation” Scores
on Parts I, II, and III of Form 3.32


                          Mean       σ        r

Part I (35 items)
  Grade 10                57.0     17.89     .85
  Grade 11                61.8     17.09     .84
  Grade 12                66.6     17.66     .85
Part II (40 items)
  Grade 10                47.0     18.18     .86
  Grade 11                52.4     18.84     .88
  Grade 12                55.4       …       .86
Part III (25 items)
  Grade 10                49.7     17.09      …
  Grade 11                57.0     19.32     .80
  Grade 12                53.6     17.38     .77
Total (100 items)
  Grade 10                  …        …        …
  Grade 11                  …        …        …
  Grade 12                  …        …        …
TABLE 3


Means, Standard Deviations, and Estimates of Reliability of “Non-Appreciation”
Scores on Parts I, II, and III of Form 3.32


                          Mean       σ        r

Part I (35 items)
  Grade 10                36.8       …        …
  Grade 11                32.3     16.26     .83
  Grade 12                29.2     17.01     .85
Part II (40 items)
  Grade 10                47.8     17.08     .84
  Grade 11                40.8     18.60     .88
  Grade 12                39.5     15.45     .82
Part III (25 items)
  Grade 10                  …      17.90      …
  Grade 11                34.2     17.76     .79
  Grade 12                39.2     15.94     .74
Total (100 items)
  Grade 10                  …        …       .92
  Grade 11                36.1     16.35     .94
  Grade 12                35.0     14.53     .92
TABLE 4


Means, Standard Deviations, and Estimates of Reliability of “Uncertain” Scores on
Parts I, II, and III of Form 3.32


                          Mean       σ        r

Part I (35 items)
  Grade 10                 8.0      8.98     .81
  Grade 11                 7.9      7.63     .78
  Grade 12                 6.1      6.75     .74
Part II (40 items)
  Grade 10                 7.5      8.01     .79
  Grade 11                 8.8      9.21     .84
  Grade 12                 7.5      8.39     .80
Part III (25 items)
  Grade 10                  …      11.67     .79
  Grade 11                  …      11.89     .85
  Grade 12                  …      11.82     .77
Total (100 items)
  Grade 10                  …        …        …
  Grade 11                  …        …        …
  Grade 12                  …        …        …


TABLE 5


Means, Standard Deviations, and Estimates of Reliability of “Appreciation” Scores
on Parts IIA, IIB, IIC, and IID of Form 3.32


                          Mean       σ        r

Part IIA (10 items)
  Grade 10                67.2     20.…       …
  Grade 11                73.6     21.14     .65
  Grade 12                73.8     21.60     .66
Part IIB (10 items)
  Grade 10                32.1     23.…       …
  Grade 11                36.5     25.10     .79
  Grade 12                41.4     25.67     .74
Part IIC (10 items)
  Grade 10                48.8     24.01     .68
  Grade 11                52.2     24.99     .73
  Grade 12                54.1     20.21     .59
Part IID (10 items)
  Grade 10                54.…     23.…       …
  Grade 11                60.9     25.45     .75
  Grade 12                65.8     25.66     .75
TABLE 6


Means, Standard Deviations, and Estimates of Reliability of “Non-Appreciation”
Scores on Parts IIA, IIB, IIC, and IID of Form 3.32


                          Mean       σ        r

Part IIA (10 items)
  Grade 10                30.7     19.23     .53
  Grade 11                23.9     16.69     .49
  Grade 12                  …      18.00     .56
Part IIB (10 items)
  Grade 10                68.4     24.77     .72
  Grade 11                61.3     28.32     .80
  Grade 12                58.7     27.98     .79
Part IIC (10 items)
  Grade 10                56.5     24.…      .69
  Grade 11                51.7     24.55     .72
  Grade 12                50.4     20.07     .58
Part IID (10 items)
  Grade 10                48.5     24.19     .69
  Grade 11                39.6     23.04     .68
  Grade 12                36.8     23.64     .73


TABLE 7


LIST OF PAINTINGS USED IN THE TEST


     Artist              Name of the Painting               Collection—Catalogue

 1.  Picasso             The Absinth-drinker                Hamburg Museum
 2.  Michelangelo        Head of Adam                       (Detail) Creation of Adam—Sistina, Rome
 3.  Cézanne             Peasant                            Conger Goodyear, New York (Venturi, No. 687)
 4.  Corot               Girl with Pearl                    Louvre (Robaut, No. 1507)
 5.  van Gogh            Self Portrait                      V. W. van Gogh—Amsterdam (De la Faille No. 344)
 6.  Vermeer             Portrait of a Young Girl           Hague, Royal Gallery (Hofstede, No. 44)
 7.  van Gogh            Self Portrait                      Museum, Amsterdam (De la Faille, No. 522)
 8.  Rembrandt           Self Portrait                      Kunsthist. Museum, Vienna (Hofstede, No. 580)
 9.  Dürer               Self Portrait                      Pinakothek-Muenchen (Tietze No. 164)
10.  Mainardi            Portrait of a Young Man            K. Friedrich Museum, Berlin (Cat. No. 86)
11.  Breughel            The Winter                         Kunsthist. Museum, Vienna (deLoo A 24)
12.  Rembrandt           A Boy Reading                      Kunsthist. Museum, Vienna (Hofstede No. 238)
13.  El Greco            View of Toledo                     Metropolitan Museum, New York (A. L. Mayer, No. 315)
14.  Hals                A Fool with a Mandolin             G. de Rothschild, Paris (Hofstede No. 98)
15.  Gauguin             Farm at the Pouldu                 Collection Vollard, Paris
16.  Breughel            The Summer                         Metropolitan Museum, New York (deLoo A 25)
17.  Corot               Paysage                            Louvre, Paris (Robaut, No. 1625)
18.  Kokoschka           Towerbridge, London                Museum, Hamburg
19.  Rembrandt           Jakob blessing Joseph’s sons       Gallery, Cassel (Hofstede, No. 22)
20.  Gauguin             Landscape in Britanny              Collection Mesnard, Paris
21.  Dürer               Self Portrait                      Prado, Madrid (Tietze, No. 152)
22.  van Gogh            Dr. Gachet                         Gallery, Frankfurt M. (De la Faille No. 753)
23.  Lorenzo di Credi    Portrait of a Girl                 K. Friedrich Museum, Berlin (Cat. No. 80)
24.  Picasso             The Guitarist                      Art Institute, Chicago (Zervos: Picasso 1895-1906, No. 202)
25.  Cézanne             The Card Players                   Louvre, Paris (Venturi No. 558)
26.  Vermeer             The Kitchenmaid                    Collection Six, Amsterdam (Hofstede, No. 17)
27.  Dürer               Hieronymus Holzschuher             German Museum, Berlin (Tietze No. 957)
28.  Corot               Interrupted Reading                Art Institute, Chicago (Robaut, No. 1431)
29.  El Greco            St. Martin and the Beggar          Art Institute, Chicago (A. L. Mayer, No. 298)
30.  van Gogh            Pear Tree in Blossoms              Collection V. W. van Gogh, Amsterdam (De la Faille No. 405)
31.  Hals                A Mulatto                          Museum, Leipzig (Hofstede No. 96)
32.  Cézanne             A Village                          Collection George Renard, Paris (Venturi No. 307)
33.  Breughel            The Autumn                         Kunsthist. Museum, Vienna (deLoo A 26)
34.  Kokoschka           Flowers on the Window              Munich
35.  El Greco            Mater Dolorosa                     Munich (A. L. Mayer, No. 86)
36.  van Gogh            Blossoming Almond Spray            Collection V. W. van Gogh, Amsterdam (De la Faille No. 392)
37.  Michelangelo        Adam, Creation of Adam             (Detail) Sistina, Rome
38.  Rembrandt           A Young Girl at an Open Half Door  Art Institute, Chicago (Hofstede No. 324)
39.  Cézanne             Basket of Apples                   Art Institute, Chicago (Venturi No. 600)
40.  Hals                The Gipsy Girl                     Louvre, Paris (Hofstede No. 119)
TABLE 8

LIST OF PAINTINGS USED IN THE COMPARABLE FORM


     Artist                Name of the Painting              Collection—Catalogue

 1.  Breughel              The Peasants’ Wedding             Kunsthist. Museum, Vienna (deLoo A 27)
 2.  Bronzino              Bia de Medici                     Uffizi, Florenz (A. McComb, p. 61)
 3.  van Gogh              Sun Flowers                       Collection V. W. van Gogh, Amsterdam (De la Faille 458)
 4.  Rembrandt             Self Portrait                     Louvre, Paris (Hofstede No. 569)
 5.  Roger v. d. Weyden    The Knight with the Arrow         Museum, Brussels
 6.  Ambrogio da Predis    Portrait (Beatrice d'Este)        Ambrosiana, Milan
 7.  Modersohn-Becker      Still-life with Flowers           Museum, Hamburg
 8.  Breughel              Fight of Lent with Carnival       (Detail) Kunsthist. Museum, Vienna (deLoo A 2)
 9.  Gauguin               The Girl with the Fan             Folkwang Museum, Essen
10.  Michelangelo          Head of the Prophet Jeremiah      Sistina, Rome
11.  El Greco              Cardinal Fernando Nino Guevara    Metropolitan Museum, New York (A. L. Mayer, No. 331)
12.  Memling               Portrait Nicolas di Sforzore      Spinelli Museum, Antwerp (Weale: Memling p. 13)
13.  Cézanne               The Smoker                        Kunsthalle, Mannheim (Venturi, 684)
14.  Vermeer               A Lady at the Virginals           Royal Collection Windsor (Hofstede, No. 28)
15.  M. Laurencin          Portrait of a Girl                Pallas Gallery
16.  Cézanne               Vase of Tulips                    Art Institute, Chicago (Venturi, 617)
17.  R. Dufy               Window in Nice                    Art Institute, Chicago
18.  van Gogh              Portrait of an Old Peasant        Collection Bernheim jeune, Paris (De la Faille 444)
19.  Carl Hofer            Girls Throwing Flowers            Art Institute, Chicago
20.  van Gogh              The Zouave                        Collection Unger = Mens, Rotterdam (De la Faille 424)
21.  Dégas                 Woman Drying her Neck             Louvre, Paris
22.  Dégas                 Girls Ironing                     Louvre, Paris
23.  Modersohn-Becker      Still-life with Fruits
24.  Breughel              The Unfaithful Shepherd           Pennsylvania Museum of Art (deLoo A 29)




Artist 


. Goya 


‚ Winslow 


Homer 


. Rousseau, 


Henri 


. Cézanne 
. Rembrandt 
. Rubens 


. Barent 


Fabritius 


. Cézanne 


‚ Rousseau, 


Henri 


- van Gogh 
- El Greco 
‚ Corot 


- Vermeer 
- Manet 


‚ Corot 
. Gauguin 


- Winslow 


Homer 


- Carl Hofer 
. Goya 
- Chardin 


- Vermeer 


. Dufy

- Dégas 

‚ Breughel 
- Cézanne 


Name of the Painting 


The Bandit Margato, 
Shot 
The Gulf Stream 


The Cascade 
Man in a Cotton Cap 
Self Portrait 


Portrait of a Bearded 
Man 


Eli and Samuel 
Seine at Bercy 
Summer 
Montmartre 


St. Francis and the 
Skull 


The Haywagon 


The Lacemaker 

Mlle. Victorine as an 
Espada 

Morning on the Lake 

Tahitian Woman with 
Children 

Adirondacks Guide 


Landscape in the Tessin 

Boy on a Ram 

Girl Scraping Vegetables


Lady with a Lute 


Regatta at Deauville 
L'Absinth 


The Crash of Ikarus 
The Aqueduct 


Collection—Catalogue
Art Institute, Chicago (A. L. 
Mayer, No. 597c) 
Art Institute, Chicago 


Art Institute, Chicago 


Museum of Modern Art, New 
York (Venturi, 73) 

Kunsthist. Museum, Vienna 
(Hofstede 581) 

Liechtenstein, Vienna 


Art Institute, Chicago 


Kunsthalle, Hamburg (Venturi 242)
Collection Flachfeld, Paris 


Art Institute, Chicago (De la 
Faille 272) 

Art Institute, Chicago (A. L. 
Mayer, No. 267) 
Collection Dollfus 

No. 1117) 
Louvre, Paris (Hofstede No. 11) 
Metropolitan Museum, New 
York 
Robaut, No. 1625 
Art Institute, Chicago 


(Robaut, 


Art Institute, Chicago 


Art Institute, Chicago

Liechtenstein, Vienna (Wild- 
enstein, No. 46) 

Metropolitan Museum, New 
York 

Louvre, Paris 

Louvre, Paris 

Museum, Brussels 

Museum of Occidental Art, 
Moscow (Venturi 477) 
Appendix V


TABLE FOR CHAPTER V


Reliabilities, Means, and Standard Deviations for “Like” of the Different Categories
in 8.2a for a Population of 542 Students (261 Boys, 281 Girls) in the 71th Grade

No. of
Items in
Category   Category                  Mean    Sigma     r

24         Soc. Sci.       Total     39.3    27.2     .92
                           Boys      41.6    27.6
                           Girls     37.1    25.4
16         Biology         Total     45.6    27.8     .87
                           Boys      44.4    29.4
                           Girls     46.6    26.3
16         Phys. Sci.      Total     50.1    29.5     .89
                           Boys      60.4    27.6     .87
                           Girls     40.6    28.0     .88
16         English         Total     48.7    26.4     .85
                           Boys      39.5    27.0
                           Girls     57.2    23.0
16         Foreign Lang.   Total     47.4    31.0     .90
                           Boys      36.7    29.8
                           Girls     57.2    29.0
16         Mathematics     Total     36.3    29.0     .89
                           Boys      45.8    29.2
                           Girls     27.6    25.6
16         Business        Total     55.9    23.6     .80
                           Boys      56.8    24.3
                           Girls     55.0    23.9
16         Home Econ.      Total     50.8    28.6     .88
                           Boys      30.8    21.5     .80
                           Girls     69.4    22.2     .81
16         Ind. Arts       Total     50.8    25.6     .82
                           Boys      59.2    26.2
                           Girls     42.9    23.3
16         Fine Arts       Total     45.8    30.0     .89
                           Boys      33.2    26.8     .87
                           Girls     57.4    28.2     .88
16         Music           Total     46.8    29.4     .89
                           Boys      37.0    28.6
                           Girls     55.8    27.3
16         Sports          Total     55.2    23.5     .79
                           Boys      56.8    24.3
                           Girls     53.6    23.6
38         Manipulative    Total     45.2    19.1     .85
                           Boys      41.6    19.2
                           Girls     48.2    18.3
35         Reading         Total     47.0    22.6     .90
                           Boys      45.8    23.6
                           Girls     48.0    22.5
Appendix VI

TABLES FOR CHAPTER VI

TABLE 1

Ranges, Means, Standard Deviations, and Reliabilities of the Different Categories
in 8.2b from a Random Sample of 1000 Students
No. of 
Items in 
Category 




19 
25 


32 
16 
28 
10 
1s 
26 
16 
16 
16 
16 


Likes Dislikes 
Category | 
Range Mean Sigma r | Range Mean Sigma r
9-95 (so | 3g o | as | ыо 36.2 | 19.4 | .76 
0-100 | 51.8 | 19,8 | о | 0-95 16.9 | 13.2 | 72 
0-100 | 49.3 | 19.4 | .84 | 0-90 | 21.4 | 14.0 | .78
"| 8: 400°) 24.6 |.26.2:] 786 [отоо | £3 4 | 24.2 | .86 
:[9-99 | 48.5 | 20:2 | “gs 0-95 | 19.5 | 15.5 | .79 
ате Sex. . af 82100 | $5.0 |.23:2 | "65. | geo 21.9 | 17.3 | .59 
School Activities. -| 0-100 | 55:0 21.2 | .76 | 0-95 | isio | 14/8 | 68 
Authority... “| н 2353 | ui | оз. [улу 37.2 | 15.8 | .71 
Leadership | 0-100 | 35.0 | 22.8 | .79 | 0-100 | 22.7 | 19.8 | .79
Fantasy. 0-100 | 42.2 | 24:8 | 82 | ontop 21.1 | 19.6 | .79 
Magic, 0-94 | 26.1 | 2-4 | “a9 0 | 21.8 | .80 
Mystery, 0-100 | 39.8 | 31/8 | 777 
(7th Grade through 12th Grade)




TABLE 2 


Ranges, Means, Standard Deviations, and Reliabilities of the Different Categories 
in 8.2c from a Random Sample of 1000 Students (7th Grade through 12th Grade)


No. of Items | Category | Likes: Range | Mean | Sigma | r | Dislikes: Range | Mean | Sigma | r
14 | Aggression | | 31.6 | 20.8 | .73 | 0-9; | 32.5 | 20.2 | .71
16 | Severity | | 33.5 | 18.6 | .70 | 0-95 | 29.9 | 16.6 | .64
24 | Life-Death-Universe | 0-99 | 33.0 | 22.6 | .86 | 0-99 | 26.9 | 21.0 | .86
26 | Preoccupation with Cleanliness | | 47.2 | 18.6 | .78 | 0-80 | 22.6 | 13.6 | .73
24 | Humor | | 47.0 | 19.8 | .80 | 0-80 | 21.9 | 15.4 | .76
24 | Self-acceptance | 0-95 | 42.5 | 19.4 | .78 | 0-85 | 28.7 | 15.8 | .72
25 | Methodical | 0-100 | 42.0 | 20.2 | .81 | 0-95 | 23.0 | 17.3 | .79
16 | Identification with Others | 0-100 | 49.0 | 23.4 | .79 | 0-85 | 16.9 | 13.6 | .62
16 | Non-identification with Others | 0-90 | 34.2 | 20.0 | .73 | 0-95 | 30.8 | 18.0 | .66
18 | Solitary | | 40.0 | 15.4 | .53 | 0-85 | 33.5 | 15.3 | .56
Ability, level of, in score interpreta-
tion, 435-436

Ability to Apply Social Facts and 
Generalizations test, 168-175; be- 
havior, analysis of, in, 172-173; 
criteria for appraisal, 173-174; ob- 
jective, analysis of, 168-169 

Achievement, analysis of, in Appli-
cation of Principles test, 104-111

Achievement tests, inadequacies of 
early, 3-4 

Activities: out-of-school, 369; school,
evaluation of, 368-369; records,
use of, 166; records, reliability and
validity of, 330


Adaptation, role of, in adjustment,
354
Adjustability (see also Adjustment),
social, 489,
Adjustment: maturation and adapta-
tion in, 354; meaning of, 350-;
optimum, 353-354
Administration of Evaluation Pro-
gram, 439-459
Administrative problems in obtain-
ing records, 449-450, 454
Adolescents: interests of, 316; verbal
expression of art statements by,
279 n
Aggression, evaluation of, 366-377
Aims (see Objectives)
Ambivalence: between general and
specific values, 241-242; in social
beliefs, 431-432
Analogy, as type of behavior, 49,
101
Analysis, power and habit of, in
Behavior Description, 480
Analysis of Controversial Writing
test, 150-154; conclusion concern-
ing, 154; scoring, 152-154; sample
problems in, 151-152; criteria for
selecting items, 150


“Anecdotal method” of recording, 
466 
Anecdotal records: criteria for se- 
lecting, 163 n; inadequacies of, 
in evaluating art appreciation, 279; 
social sensitivity, 160-161, 163-164 
Application of principles of logical 
reasoning: evaluation instruments, 
development of, 114-122; objec- 
tive, analysis of, 111-114 (see 
also Application of Principles of 
Logical Reasoning test) 
Application of Principles of Logical 
Reasoning test, 111-126; readiness 
of class for, 126; sample problem 
in, 119-121; scores, summary and 
interpretation of, 122-124; state-
ments, kinds of, in, 121; validity
and reliability of, 124-126 
Application of Principles test, 77-
111; analogy of statements in, 101;
authorities, statements of, in, 101; 
construction of, 80-111; data sheet 
sample, 102; directions for, ex- 
ample, 88-89; errors in responses, 
83-84; essay-type vs. objective 
form, 84-85; problem situations
in, 80-111; reasons for responses
in, 82-84; sample problem, 89-90;
scores, summary and interpreta- 
tion of, 103-111; social values 
tested in, 95-101; types of re- 
sponses in, 81-82 

Application of Science Principles 
(see Application of Principles) 

Applying Social Facts and Generali- 
zations to Social Problems test 
(see also Ability to Apply Social 
Facts and Generalizations), 175, 
191-208; behaviors evaluated in, 
а description of, 198-203; 
edes cte ipm 
es 199-208; uses 
Appraisal (see Evaluation, Reports,
Tests) 

Appreciation, Aspects of (see also 
Appreciation of Art, of Literature, 
of Social Values), 245-312 

Appreciation of art (see also Art): 
evaluation of, 276-312 

Appreciation of literature, 246-276; 
meaning of, in the Study, 246; 
behaviors in, 249; test of, 250-276 

Appreciation of social values, use 
of, 240 

Areas of activity (see also Areas 
of Living), interest tests and, 318 

Areas of Living, interests, role of, 
in, 317-318 

Areas of thought in Behavior De- 
scription, 485 

Argumentum ad Hominem principle, 
112 

Art (see also Art Appreciation, 
Painting, Art Experience, expres- 
sion, sensitivity), interest in, tests 
of, 277 

Art Appreciation (see also Art Ex-
perience): assumptions concern-
ing, 283-285; evaluation of, 276-
312; meaning of term, 280-
281, 283-284; objectives, 276-
277; psychology of, 280-283; rec-
ords, inadequacies of most, 279;
test (see Art Appreciation test)

Art Appreciation test (see also Find-
ing Pairs of Pictures test): ad-
ministration of, 299-300; assump-
tions underlying, 283-285; criteria
for, 279-280, 287-289; description
of, 289-292; development of, 283-
289; interpretation of, 292-299;
reliability of, 300-301; score range,
304; scoring, 292-294; use of, 306-
307; validity of, 301-303, 305-
306

Art Experience (see also Art): and
creativity, 282-283; emotional re-
action in, 285; meaning of, 281;
methods of data gathering, 277-
278; nature of, 285; spectator’s
role in, 281-283


INDEX 


Art expression and “Gestalt” psy- 
chology, 280-281 

Art History as an Academic Study, 
283-284 

Art sensitivity, meaning of, 283- 

Art teaching, purposes of, 27 

Art test (see Art Appreciation test)

Art values, sensitivity to, 276- 
278 

Art and verbal facility, 278-279 

Artist's reactions, 283-284 

Arts (see Art, Dramatics, Theater) 

Aspects of Appreciation (see Ap-
preciation)

Assumptions, basic, of Evaluation 
Staff, 11-15 

Assurance in Behavior Description, 
485 

Authority, reactions to, 370 


Background data in one case study, 
409-410 

Battery of instruments, reasons for, 
406-408 

Behavior: central pattern of, 430- 
431; classifications, 351-352, 484- 
485; combinations of, determined, 
433-434; descriptions (see Be- 
havior Description); deviant, hy- 
potheses concerning, 431-432; 
motivation, role of, in evaluating, 
351; objectives defined in terms of,
19-20; organic unity of, 7, 405;
patterns, 11-12, 13, 19-20

Behavior Description, 470-487; ad-
vantages of, 486-487; classifica-
tions in, 484-485; on college-en-
trance blank, 496-497; Commit-
tee on, 470; data interpretation,
functions of, in, 403-404; Manual, 
485-486, 496; records, 279, 466, 
471, 474-487, 493; in subject 
fields, 501-502 

Belief: as type of social attitudes, 
205; instruments, 208-209 

Beliefs About School Life test, 208, 
229-234; results of, in one school, 
437 

Beliefs on Economic Issues, charac-
teristics of, 235-236


Beliefs on Economic Issues test, 208,
234-238

Beliefs on Housing, 209 

Beliefs on Social Issues test, 208-
234; consistency evaluated in,
data sheet sample, 2
scription of, 215
of, 209-215; honesty in,
language's role in validit
reliability studies of, 22.
statements in, 216-217; sampling
and statement formul
212; score patterns
interpretation of, 2
and summarizing results, 217
uncertainty evaluated in,
validity and reliability of, 225-
229

Beyond data, 55, 56, 62, 408

Bibliography of evaluation instru-
ments, 21-22

Biology (see Science)

“Birth-Life-Death” fantasies, 370


Carnegie Foundation for the Ad-
vancement of Teaching, 494 n

Carroll, Herbert, 246 n

Case study, one, based on test data, 
408-499 

Caution score, 55 

Changes (see also Growth, Student
Growth): behavior, as educational
objective, 11-12; diagnoses of, by
tests, 242-243; in school practices,
resulting from evaluation, 457; in
school programs, resulting from
evaluation, 436-437; in students,
evaluation as check of, 436

Checklists: data summaries of, in
reading interests, 334-337; validity
of, 333-334

Chemistry (see Science)

Classroom situations as source of
evaluation data, 446

Classroom teacher (see Teacher)

Cleanliness, preoccupation with, 366

“Clear thinking” objectives, 35-37

College: changes in information for 
admission to, 495; Committee on 
Admission, report to, 494-498; 
"Junior Year" blank for, 498; trans-
fer from school to, form, 494-498

Committee: on Admission to Col- 
lege, report to, 494-498; on Evalu- 
ation in the Arts, 276; on the 
Evaluation of Interests, 313; on
the Evaluation of Interests and 
Appreciations, 245; on the Evalua- 
tion of Reading, 247; on Evalua- 
tion and Recording, xx; on the 
Interpretation of Data, 38; on Re- 
ports and Records, 464; on School 
and College Relations of the Edu- 
cational Records Bureau, 495; on 
the Study of Adolescents, 349 

Community (see also Home, Parents, 
Public Relations), evaluation's
role in school's relations with, 10 

Compulsiveness, evaluation of, 366

Conservatism: beliefs, 213; in social 
beliefs, 217-219; terms, as indi-
cating direction, 213, 217, 220

Consumer aspect of applying logical 
principles, 114 

Content, course, as means to ends, 
11 

Controversial Writing test, analysis 
of, 150-154 

Cooperative planning (see also Eval- 
uation program planning), 440- 
442 

Counselor (see also Teacher): In- 
terest Questionnaire, value of, to, 
345-347, 396-399; interpretation 
of evaluation data by, 452; inter- 
views with, in one case study, 413- 
417 

Course revision, evaluation in, 26-
27


Creation (see also  Creativeness), 
meaning of, 474 

Creativeness: in art experience, 282- 
283; in appreciation of literature, 
248, 251; characteristics of, 474- 
475; evaluation of, 475-476; and 
Imagination in Behavior Descrip- 
tion, 474-476, 478 

Critical-mindedness in Reading of 
Fiction test, 265-267 
“Critical” thinking (see Clear think-
ing, Logical Reasoning)

“Crude errors” in data interpreta- 
tion, 47, 55 

Cultural activities in one case
study, 417-418

Curiosity in Appreciation of Litera-
ture test, 248, 251

Curriculum: based on hypotheses,
7-8; changes in, resulting from
evaluation, 436-437; effectiveness
of, appraised, 453-454; improve-
ment of, one purpose of evalua-
tion, 403, 432-436; Reading ques-
tionnaires in appraising, 275-276;
and school program (see School
program)


Dale, Edgar, 328 n

“Dartmouth Visual Survey,” 473 n

Data (see also Evaluation Data):
classifications of, 41-42; criteria for
selection, 42-43; dependability of,
evaluating, 40; interpretation of,
38-76; kinds of, for interpreta-
tion, 41-43; presentation of, forms,
41; selection and use of, 31-32;
sources of, 42


Deductive thinking, 78 

Definitions principle, 119 

Democracy (see also Democratic):
as interest area in social issues,
209; liberalism and conservatism
regarding, 217; in school, 229

Democratic: meaning of term, 183;
attitudes, evaluation data useful
in developing, 457; tenets (see
also Social Problem values), 175,
179; values appraised in Social
Problems test, 183-184, 187

Descriptive Trait Profile, 358, 383-
384, 388

Devices (see Instruments, Tests)

Directing Committee of the Study,
3-4

Drama Questionnaire, 253, 264

Dramatics, interest in, 371-372

Drives and impulses, organization
of, 364-367


Economic issues (see also Beliefs 
on Economic Issues test), beliefs 
on, 234-238 

Economic relations: as interest area
in social issues, 209; liberalism
and conservatism regarding, 217-
218

Education: continuity of, 494-495; 
purpose of, 11 

Eells, Walter Crosby, 326 

Emotional adjustment fostered by 
the arts, 276 

Emotional control in Behavior De- 
scription, 485 

Emotional disposition and “aca-
demic” interests, 396

Emotional Responsiveness, in Be-
havior Description, 481

Emotional tendencies, interpretation
of (see Interests and Activities
Questionnaire, interpretation of)

“Empathy,” 280

Environment and individual, rela-
tionship, 468

Essay-type test: criticisms of, 84-85;
and Form 2.52, correlation be-
tween, 67-73

Esthetic experience, 280, 283

Evaluating, habit of, 33

Evaluation (see also Evaluation
Data, Tests): continuity of, essen-
tial, 438, 449; complexities, rea-
sons for, 6-7; definition of, in the
Study, 5; influences of, on teach-
ing and learning, 14; interpretation
of (see also Interpretation), 6, 25-
28; methods, selection of, 21-23;
role of, in educational process,
29-30; purposes of, 7-11, 403, 432-
437; results of, use of, 454-459;
school's responsibility for, 14;
traditional, inadequacies of, 146;
whole-faculty responsibility for,
438


Evaluation adviser, 458 

Evaluation of Art Appreciation (see 
also Art Appreciation), 276-312 

Evaluation data: assumptions under-
lying, 405-408; available in plan-
ning program, 445; case study
illustrating synthesized, 4
circulation of, 444-447,
collection of, method:
attitude toward,
ance, 430-43 interpretation of,
and teachers, 437- interpreta-
tion and uses of, 403-438; nature
of, 405-408; sources of, 446; sum-
marizing and circulating, 449-
454; synthesized, case study illus-
trating, 408. uses and interpre-
tation of, 403-438

Evaluation devices (see also Evalua- 
tion instruments, Tests), develop- 
ing and improving, 23-25 

Evaluation Instruments (see also 
Tests): Bibliographies of, 21-22; 
development of, 43-60; need for 
new, 4 

Evaluation of Interests (see also In- 
terests), 313-348 

Evaluation of Personal and Social
Adjustment (see also Personal and 
Social Adjustment test), 349-402 

Evaluation Program: concept of, by 
teachers, 442; division of labor in, 
28-29; interpretation and uses of 
data in (see also Data, Interpreta- 
tion), 403-438; as integral part of 
school, 459; limitations in plan- 
ning, 443-444; as method of 
teacher education, 30; misconcep- 
tions about, 442; needs served by, 
443; planning and administering, 
439-459; procedures in develop- 
ing, in the Study, 15-28; purpose 
of, 442; scope and emphasis of, 
441-444; summary of, 459; sum- 
mary of planning and administer- 
ing, 439 

Evaluation specialist, inadvisability 
of having, 440 

Evaluation Staff: basic assumptions 
by, 11-15; members of, 4, 5 

Evaluation techniques, wide range 
of, needed, 13-14 

Examinations (see Tests) 

Experimentation in creativeness, 475 

Extrapolation, 39, 45-46 
Faculty (see also Counselor, School
Staff, Teacher): attitude of, to- 
ward evaluation data, 455-456; 
continuity of study and collective 
thinking by, 454-456; participa- 
tion of whole, in evaluation pro- 
gram planning, 441, 457-458; re- 
sponsibility of whole, in securing 
data, 446 

Family relationships, evaluation of,


“Birth-Life-Death,” 370; 

ior, 370-371; in Interest and 
Activities Questionnaire, 364, 370-
372

Feeling-tone, type of social attitude,
205 


General accuracy, definition of, in 
test response, 51 

General science (see also Applica- 
tion of Principles test, Science), 
test construction for applying 
principles in, 80-111 

Generalizations (see also Application 
of Social Facts and Generaliza- 
tions), testing for formulation of, 
24-25 

“Gestalt” psychology and art expres- 
sion, 280-281 

Grades and awards (see also Marks, 
Reports), as area in Beliefs about 
School Life Test, 232 

Group: life, as area in Beliefs about 
School Life, 231; progress, proc- 
esses to estimate, 433-436 

Growth (see also Changes, Pupil 
growth); group’s, measure of, 433- 
436; individual, reports of, 489- 
490 

Guidance (see also Counselor): con-
tinuity of, fostered by records,
465; evaluation data, use of, in, 
8-9, 430-432, 454-455; reports in, 
492; and transfer, recording for, 
463-504 

Gullibility, 153 


Habits, work, appraisal methods 
needed for, 31-33 
Home reports (see also Parents, Re-
ports), 488-493; records as bases
of, 465; and teacher reports, iden- 
tical, 492 

Homeroom teacher, evaluation data 
summaries interpreted by, 452 

Hoskins, Luella, 329 n 

Housing, Beliefs on, test, 209 

Human Behavior (see Behavior) 

“Human Relationships” as area in 
Interests and Activities Question- 
naire, 364, 367-370 

Humor, activities in expressions of, 
372 

Hypotheses, validation of, as one 
purpose of evaluation, 7-8 


Identification: in appreciation of lit- 
erature, 248, 251; with others, 
evaluation of, 368 

If-then principle, 112 

Imagination in creativeness, 475 

Impressing others, activities in, 369 

Impulses and drives, organization of, 
as area in Interests and Activities
Questionnaire, 364-367 

Indirect argument principle, 112 

Inferences: in data interpretation, 
39; test to measure, 60-62 

Influence in Behavior Description, 
478-479 

Inquiring Mind in Behavior Descrip- 
tion, 479 


“Insight,” meaning of term, 398- 
399 


Instruments (see Evaluation Instru- 
ments, Tests) 

Intelligence, general, relation of, to 
Social Problems test results, 196 

Intercorrelation of scores in Interpre- 
tation of Data test, 59 n 

Interest (see also Interest Index, In- 
terest Questionnaire, Interests) : 
and appreciation, distinctions be- 
tween, 245; in Reading test, valid- 
ity and reliability of, 330-334 

Interest Index (see also Interest
Questionnaire, Interests and Ac-
tivities Questionnaire), 338-348;
areas in, 339; in one case study,
415; data sheet sample, 341; in-
terpretation of, 340-345; uses of,
347-348

Interest Questionnaire (see also In-
terest Index, Interests and Activi-
ties): analysis of, sample, 377;
and checklists, 337; construction
of, 338-340; use of, in developing
personality test, 358-359, 360-
361; value of, to counselor and
teacher, 345-347

Interests (see also Interest, Interest
Index, Interest Questionnaire, In-
terests and Activities, Recreational
Interests): “academic”, and emo-
tional dispositions, 396; adolescent
vs. adult, 316; data sources for
revealing, 313; evaluation of, 313-
348; as index of personality pat-
tern, 359; as means and ends,
313-314; objectives, analysis of,
313-318; questionnaire (see also
Interest Questionnaire), 338-348;
patterns of, as revealed by check-
lists, 334-337; recreational (see
Recreational Interests); signifi-
cance of, in personality evalua-
tion, 359-360; uniqueness of, 344

Interests and Activities Question-
naire (see also Interest Question-
naire): administration of, 400;
areas in, 364; categories in, 363-
372; criteria for item selection in,
362-363; drives and impulses, or-
ganization of, as area in, 364-367;
interpretation of, 372-384, 390-
392; interpretation of, to students,
399; validity of, 387-396

Interpolation in interpreting data,

Interpreter, importance of, in tests,
154-155


Interpretation (see also Interpreta-
tion of Data): ability to make
original, 67-74; ability to judge
by others, 65-67; behavior descrip-
tions, one function of, 403-404;
functions of, 403-405; over-all, by
staff member, 452-453; overgen-
eralized, 46; undergeneralized, 47
Interpretation of Data (see also In- 
terpretation of Data test), 38-76; 
accurate, 46; classifications of 
types of, 46-47; original vs. stated, 
40-41, 65; types of, 46 
Interpretation of Data test, 47-60; 
appropriateness of, for high-school 
level, 67; construction of, 48-51, 
66; form of, for junior high school, 
63-65; forms of, 67-73; reliability 
of, 74-76; response patterns to, 
73; validity of, 65-76 
Interpretation and Uses of Evalua- 
tion Data, 403-438 
Interschool Committee, 28-29 


Judging the Effectiveness of Writ-
ten Composition test, 265, 267-
268

Junior High school: Applica-
tion of Principles test for, 91 n;
Interpretation of Data test for,
63-65

"Junior Year" blank, 498 


Kuder-Richardson formula, 65 


Labor and unemployment: as inter- 
est area in social issues, 209; lib- 
eralism and conservatism regard- 
ing, 218 

Language (see also Words): choice 
of, in statements of social beliefs, 
210-212 

Leadership, activities in, 369 

Learning, influence of evaluation on, 
14 

Liberalism: meaning of term, 213,
217, 220; in social beliefs, 217-
219

Life, philosophy of, appraised, 34 

Literature (see also Appreciation of 
Literature, Recreational Interests), 
appreciation of, 246-276 

Logical reasoning: behaviors in, 112- 
113; meaning of, 111-112; test of 
(see Application of Principles of 
Logical Reasoning) 
Magazines (see also Reading maga-
zines): checklist of, 326; classifica-
tion of, by types, 326

Maladjustment (see also Adjustment, 
Personal and Social Adjustment), 
kinds of, 353 

Manipulation in creativeness, 475 

Marks (see also Grades and Awards, 
Home Reports, Parent Reports, 
Reports, Teacher Reports): for 
college admission, inadequacies of, 
488-489, 494; and interests, 316; 
as objectives, 494; in records and 
reports, 467, 468 

Maturation, role of, in adjustment, 
354 

Methodical activities, evaluation of, 
366 

Methods, evaluation: means to ends, 
11; selection and trial of, 21-23 

Militarism: as interest area in social 
issues, 209; liberalism and con- 
servatism regarding, 218 

Motivation: personal and social ad- 
justment study yields insight into, 
401; role of, in evaluating be- 
havior, 351 

Movies, checklists for revealing
recreational interests regarding,
328-329

Mystery-interests, 371 

Nationalism: as interest area in
social issues, 209; liberalism and
conservatism regarding, 219

Nature of proof, 36; assumptions in,
129; behaviors in achieving, 129;
definition of, 127-129; objective,
analysis of, 126-130; senses in ar-
riving at, 128; test (see Nature of
Proof test)

Nature of Proof test (see also Na- 
ture of Proof), 126-148; check on 
responses to, 144-147; develop- 
ment of, 130-141; sample prob- 
lems in, 132-134, 136-139; scores, 
summary and interpretation of, 
141-143; structure of, 135-141; 
validity and reliability of, 143- 
148 
Objectives: agreement on, needed for
EUM dos analysis of, 38-43; 
application of logical reasoning, 
analysis of, 111-114; application 
of scientific principles, 77, 111; an 
evaluation program, areas of, 406; 
"breaking up," 405; changes in 
behavior patterns as, 11-12; classi- 
fication of, 16-18; “clear thinking" 
as, 35-37; "comprehensive," 406; 
concern in evaluation, 5; defining, 
in terms of behavior, 19-20; 
evaluation data collection regard- 
ing, 444-446; evaluation program 
as a check on achievement of, 432-
436; formulation of, 15-16; gen-
eral and specific, relation between,
441-449; of growth reports, 489-
490; illustrations of, 12; "intangi-
ble," 439; interests as, 317-318;
interests and appreciations as, 5;
limited, overemphasis on, reasons
for, xvi; marks as, 494; propa-
ganda analysis, 149-150; record
forms for, in subject fields, 500-
502; in records, 463-469; re-ex-
amination of, essential, 16; re-
formulation of, 30; selection of,
basis for, 15-16; situations showing
achievements of, 20-21; state- 
ments of, inadequacies of, xv; in 
subject fields, study of, 499-500; 
teacher consideration of, 465-466; 
types of, 18; working, for records 
and reports, 467-469 

Omissions, scoring of, in test, 55 

Open-mindedness in Behavior De-
scription, 479-480

Organization of Impulses and Drives, 
as area in Interests and Activities 
Questionnaire, 364-367 

Out-of-school activities, evaluation 
of, 369 


“Overcaution” in data interpretation, 
47 


“Overcritical” students, 97 


Painting (see also Appreciation of
Art, Art), field of, chosen for art 
test, 286 


Parents (see also Community, Home 
Reports): participation of, in sug- 
gesting areas of beliefs, 207; 
reading questionnaires, results of,
for parents, 275; reports to, 488- 
493; reports of evaluation useful 
to, 456; security of, fostered by 
evaluation, 9-10 

Pencil-and-paper tests, 44; use of, 
in collecting evaluation data, 446- 
447 

Personal adjustment (see also Ad- 
justment, Maladjustment, Person- 
ality): meaning of, 350; and social 
adjustment (see Personal and So- 
cial Adjustment) 

Personal and social adjustment (see 
also Interests), 206; appraisal, 
techniques of, 35 358; cleanli- 
ness, preoccupation with, in, 366; 
differentiation between, 350; eval- 
uation of (see also Personal and 
Social Adjustment test), 349-402; 
interests, significance of, in, 359- 
360; objective, history of, FE 
20; summa regarding, " 
102 ту reg іч 


Personal and Social Adjustment test: 
characteristics, desirable, of, 355- 
358; Interest Questionnaire, use of, 
in developing, 358-359, 360-361; 
Interests and Activities Question- 
naires for, 361-402 

Personality (see also Adjustment, 
Personal Adjustment, Personal and 
Social Adjustment): one case
study of, 376-384; information
about, need for, xix; meaning of 
term, 350; measurement of, 351 
$5 projective methods for study- 
ing, 36; rating scale, 358 

Philosophy and objectives underlying 
recording, 463-469 

Philosophy of life, appraisal of, 34 

Physical energy in Behavior De-
scription, 485

Physics (see Science)

Planning and administering the eval- 
uation Program, 439-459 

Prejudices in social attitudes, 206 


Pre-tests, 238, 243

Primitive drives and impulses, 364- 
365 

Principles of logical reasoning (see
Application of Principles of Logi-
cal Reasoning tests, Logical rea-
soning)

Programs (see also Evaluation pro-
gram), evaluation data used in
making out, 456

Proof, nature of (see Nature of
proof)

Propaganda: definitions of, 149; 
analysis, 148-154; behaviors re- 
lated to, 149-150 

Public relations (see also Commu- 
nity, Home, Parents), evaluation 
as a basis for, 10 

Pupil (see also Student): develop- 
ment in subject fields, 499-504; 
growth, objectives of, classifica- 
tion of reports on, 490; and teacher 
relations, 231 

Purposes (see Objectives)


Qualitative vs. quantitative under- 
standing, 82 

Questionnaire: techniques, assump-
tions in, 252; on voluntary read-
ing (see Questionnaire on Volun-
tary Reading)

Questionnaire on Voluntary Reading, 
253-264; criteria for item selec- 
tion on, 255-257; data sheet, sam- 
ple, 259; description of, 253-257; 
scoring, 258-264; summarizing,
257-264; use of, 273-275


Race: as interest area in social issues, 
209; liberalism and conservatism 
regarding, 218 

Radio: checklists for revealing recre-
ational interests, 328, 329-330;
preferences, 329-330

“Rating” (see also Grades and 
awards, Marks), 486 

Reading (see also Appreciation of
Literature, Reading Record, Recre-
ational Interests): fiction, classifi-
cation of, by type, 322, 324; check-
list of, interests, 334-337; maga-
zines, 325-327; non-fiction, classi-
fication of, 322, 325; points, 45;
reactions to (see Reading reac-
tions, Reading Reactions Question-
naire); records (see Reading rec-
ords); voluntary (see Reading
Questionnaire)

Reading reactions (see also Reading 
Reactions Questionnaire): evalu- 
ation, need for, 249, 252; meaning 
of, 248-249; synthesis of data in 
one case study, 417-418; tests of, 
265; types of, 248-249, 251-252 

Reading Reactions Questionnaire, 
250-273; "direct" forms of, 271- 
272; direct observations and ques- 
tionnaire techniques, difference 
between, 269-270; student honesty 
in, 269-270; uses of, 273-276; 
validity of, 268-273 

Reading records: for revealing in- 
terests, 319-325; samples of classi- 
fication in, 322; use of, 166 

Reasoning, logical (see Application 
of principles of logical reasoning, 
Logical reasoning) 

Record forms: objectives for, 467- 
469; for objectives in subject fields, 
500-501; purpose of, 464 

Record keeping, decentralized, in- 
adequacies of, 450-451 

Records (see also Behavior Descrip- 
tion, Record forms): activities, 
166; activity, validity and re- 
liability of, 330; behavior, 206 n; 
observational, 43, 449-450; read- 
ing (see also Reading records), 
319-325; and reports, objectives, 
467-469 

Recreational interests: areas of, 318; 
checklists, use of, 334-337; maga- 
zine checklist for revealing, 325- 
326; movie checklists for reveal- 
ing, 328-329; newspaper question- 
naire for revealing, 327-328; radio 
checklist for revealing, 328-330;
reading record for revealing, 319- 
325; validity and reliability of 
tests for, 330-334 
Reflective thinking (see also Appli-
cation of Principles of Logical
Reasoning), process of, 36

Relationships: family, evaluation of,
367; human, as area in Interests
and Activities Questionnaire, 364,
367-370; with opposite sex, evalu-
ation of, 367; with same sex, eval-
uation of, 367; in social values,
175

Reliability of scores, 63 

Religion, beliefs on, 208 n 

Report cards (see Report forms) 

Report forms (see also Reports): 
490-491; traditional, 488 

Reports: objectives of, 489-490; to
parents, 456, 488-493; on pupil
growth, classifications of objec-
tives in, 490; records as basis and
part of, 405

Reports and Records: Committee on, 
464; objectives for, 467-469 

Responsibility-Dependability in Be- 
havior Description, 477-478 




Sampling, as type of data interpre- 
tation, 46 


Satisfaction, evaluation of, in Ap- 
preciation of Literature test, 248, 
251 

Scales of Beliefs (see also Beliefs); 
207, 239; on economic issues (see 
Beliefs on Economic Issues test); 
on social issues (see Beliefs on 


Social Issues test); uses of, 240- 
241 


Schedule for testing, 447-448 

School: democracy in, 229; evalua-
tion, responsibility of, for, 14;
evaluation of activities in, 368-
369; government as area in Be-
liefs about School Life, 230-231;
life, beliefs about (see also Be-
liefs about School Life), 208; ob-
jectives (see Objectives, school);
program (see also Curriculum),
changes in, 436-437; program,
hypotheses underlying, 436; re-
sources, evaluation program
limited by, 443-444; spirit, 233;
staff (see also Faculty, Teachers):
security of, fostered by evalua-
tion, 9-10; training of, for in-
terpreting evaluation results, 27-
28

Science principles (see also Appli- 
cation of Principles): application 
of, 77-111; meaning of, 78-80 

Score (see also Scores): “beyond 
data,” 55; caution, 55; crude er-
rors, 55; omissions in, 55; deriva-
tion of, 54; general accuracy, 51

Scores: analysis of, on Interpreta- 
tion of Data test, 56-60; intercor- 
relation of, 59 n; reliability of, in- 
creased, 63; students’ knowledge 
of, inadvisable, in Interests and
Activities Questionnaire, 397-399; 
summary and interpretation of, 
in Application of Principles test, 
108-111 

Secondary school (see School) 

Security, psychological, fostered by 
evaluation, 9-10 

Self-reliance in Behavior Descrip- 
tion, 485 

Senses, use of, in arriving at proofs, 
128 


Seven Modern Paintings test, 307- 
312 

Short-answer tests, 47 

Social action, skill in securing evi-
dence of, 161-168

Social adjustability: in Behavior De-
scription, 482; meaning of, 350

Social attitudes: analysis of behavior
in, 204-209; belief, as type of,
205; characteristics of, 205; defini-
tion and classification of, 161, 203-
209; evaluation of (see Applying
Social Facts and Generalizations);
expressions of, 206-207; feeling-
tone as type of, 205; tendency to
act, as type of, 205

Social awareness, meaning of, 161

Social beliefs (see also Beliefs on
Social Issues): ambivalence in,
possible reasons for, 431-432;
areas of, 207-209; characteristics
of, 212-214; conservatism in, 217-
219; consistency of, 213-214; in-
struments of evaluation, 208-209;
liberalism in, 217-219; scale con- 
struction of, 214-215; about school 
life (see Beliefs about School 
Life); statements of, language in, 
210-212; test (see Beliefs on So- 
cial Issues, Social sensitivity); 
threshold in statement of, 210 

Social consciousness (see Social sen-
sitivity)

Social generalizations, 169-172 

Social information, meaning of, 161 

Social issues (see also Beliefs on So-
cial Issues): areas of, 208;
interest in, 209-210; direction of 
positions toward, 212-213; test 
(see Beliefs on Social Issues test) 

Social Problems test (see also Beliefs
on Social Issues test): comprehen-
siveness appraised, 174, 184-186;
consistency appraisal in, 174, 184, 
189-190; criteria for choosing 
items in, 176; data sheet sample, 
185; democratic values appraised 
in, 183-184, 187; development of, 
177-184; intelligence, relation of,
to results of, 196; key for, 182- 
183; logical aspects, interpretation 
of, in, 189; rationalization ap- 
praised in, 188; relevance ap- 
praisal in, 174, 187; results of, 
related to interests, 318; results, 
summarized, 184-190; scoring, 
validity of, 191-192; structure for, 
176; student interviews, as va-
lidity checks of, 194-195; teach-
ers’ observations compared with
results of, 193-194; use of, 240, 
241, 244; validity of construction 
of, 191; validity and reliability of, 
190-197; value patterns in, 177, 
179, 183, 189 

Social science, generalizations taught 
in, list of, 169-170 

Social sensitivity: anecdotal records 
in obtaining evidence of, 160-161, 
163-164; aspects of, 159-162; be-
haviors involved in, 158, 159-162;
evaluation of, 157-244; free-re-
sponse tests in obtaining evidence 
for, 166; meanings of, 158-159; 
objectives, origin and scope of, 
157-159; pattern of, 166-167; 
students’ writings as means of se- 
curing evidence about, 164-166 

Social values (see also Beliefs on
Social Issues test), 158-159; ap-
plication of, 175-197, 406; appli-
cation of, test construction on,
175-180; behavior in applying,
174; beliefs test, uses of, 238-244;
tests (see Application of Social
Facts and Generalizations, Social
Problems); use of tests, 238-244

Society, demands of, conforming to, 
352-353 

Solitary activities, evaluation of, 369 

Strong Vocational Interest Blank, 
318 

Student (see also Pupil): background
of, important in social tests, 233; 
behavior patterns, organization, 
12-13; development, evidence of, 
sources for, 444-449; interviews, 
as validity checks, 194-195, 331- 
333; participation of, in test con- 
struction, 207, 211; philosophy of 
life, appraisal of, 84; programs, 
evaluation useful in making out, 
456; scores on Interests and Ac- 
tivities Questionnaires, unwise to 
show to, 397-399; security fos- 
tered by evaluation, 9-10; self- 
observations in test construction, 
252 


Study, conditions for effective, 31; 
skills and work habits needing ap- 
praisal, 31-33 

Subject fields: objectives in, record- 
ing of, 499-500; record forms for, 
500-501, 503, 504 

Suggestibility, 152 


Teacher (see also Counselor, Faculty,
Teachers, Teaching): education
through evaluation programs, 30;
and pupils, relation, 231-232; rat-
ing and test results compared, 193-
194; reports, 410-413, 492; train-
ing, 459

Teachers: concern of, in behavior de- 
velopment, essential, 238, 239; 
evaluation program, reaction, 20, 
432; insights translated into prac- 
tice by, 454-455; Interest Ques- 
tionnaire, value of, to, 345-347; as 
interpreters of evaluation data, 
458-459; objectives considered by, 
465-466; observations of, com- 
pared with test results, 193-194; 
realization of objectives in sub-
ject fields by, 499-500; security of,
fostered by evaluation, 9-10; sub- 
ject-field forms useful to, 504; 
training of, in interpreting evalua-
tion results, 27-28

Teaching (see also Guidance): eval-
uation data used in, 454-455; in-
fluence of evaluation on, 14

Tension, inadvisable to point out, to
student, 398 

Test (see also Evaluation, Tests): 
construction, 114-122; data, sum- 
mary of, in one case study, 415- 
429; responses, terminology de-
scribing, 51; scoring, traditional
inadequacies of, 44; schedule for, 
447-448; situation, total, 156 

Tests: achievement, inadequacies 
of early, 3-4; allocation of, to 
faculty, 448; bibliographies of, 21- 
22; essay-type, criticisms of, 84- 
85; pencil-and-paper, 44, 162, 187; 
science, principles of applying, 77-
111; readministration of, 155, 156;
short-answer, 47; structure of, in- 
terpreter's understanding of, 154- 
155; written, shortcomings of, 
14-15 

Theater arts (see also Dramatics), 
interest in, 371-373 


Thinking (see also Application of
Principles of Logical Reasoning,
Clear thinking, Logical reasoning,
Social thinking): aspects of, 35-
156; as objective, 3:

Thirty Schools (see also School), re-
sponsibility of, for evaluation, 3,
14-15 

Thurstone, L. L., 214 

Time: effective use of, for study, 31;
recording data, economy of, in, 45, 
454 

Traits (see also Behavior, Behavior 
Description), 470, 473 

Transfer: Behavior Description card 
useful in, 486-487; to college,
494-498; recording for, 465, 494-
498

Trends, recognition and compari-
son of, 45-46, 49


“Undemocratic” (see also Democ-
racy, Democratic), meaning of
term, 183 

Units for college admission, inade- 
quacies of, 494 


Value: judgment, 45; pattern, ambiv- 
alence in, 244 

Values (see also Social Values), gen-
eral vs. specific, 241-249

Verbal facility and art, 278-279 

Vocabulary, appropriateness of, in 
administering tests, 239 

Vocational tests, interests sampled 
by, 318 


Work habits: in Behavior Descrip- 
tion, 482; and study skills needing 
appraisal, 31-33 

Wert, James E., 327 

Whole-faculty (see Faculty)

Wickman, E. K., 353 n


Words, “people-describing,” 470-
471